ABSTRACT

A general theory is developed for the estimation of linear functionals in three distinct classes of nonlinear problems. The functional is linear in the solution vector $x_0$ of the problem, an example being $(x_0, p)$, where $p$ is assignable.

The problems considered are all generated via the gradients of some given quadratic or non-quadratic Lagrangian functional over two inner product spaces. This may be a saddle functional, or it may be constructed by embedding a given nonlinear problem with the aid of a Lagrange multiplier. Many different problems in applied mathematics are thereby included.

In some cases the assignable coefficient can be chosen in such a way that the bounds calculated for the linear functional are pointwise bounds on the solution vector. In general this requires further investigation, but estimation of the deflection at a point on a cantilever beam is illustrated in §6.

GENERAL ESTIMATES FOR LINEAR FUNCTIONALS IN
NONLINEAR PROBLEMS
M. J. Sewell and B. Noble

1. Introduction

(i) Scope of the investigation

This  paper  presents  the  results  of  a general  study  of  systematic 
methods  for  getting  upper  and  lower  bounds  to  the  solution-values  of 
linear  functionals. 

A new  theoretical  framework  is  set  up  for  this  purpose.  It  contains 
a wide  class  of  linear  and  nonlinear  problems  which  can  be  defined  in 
terms  of  the  gradients  of  some  given  quadratic  or  non-quadratic  generating 
functional.  It  is  often  important  to  be  able  to  construct,  using  an 
assignable  coefficient,  a linear  functional  of  the  solution  of  such  a 
problem,  and  to  estimate  its  value.  This  can  be  related  to  the  problem  of 
finding  pointwise  bounds. 

The framework exhibits in a natural way three different types of situation, requiring different methods which we call the general optimization method (§2), the general embedding method (§4), and the nonlinear programming method (§5).


This research was sponsored in part by the United States Army under Contract No. DAAG29-75-C-0024, and in part by the University of Oxford Computing Laboratory and the University of Reading Mathematics Department.

Underlying these situations is an appropriate generalization (§3) of the second mean value theorem, already indicated in our earlier joint paper (Noble and Sewell, 1972, equation (5.3)). In particular this can be used to provide sufficient conditions for satisfying a saddle inequality in the form proposed by Sewell (1969, equation (2.50)). A saddle functional generates a wide class of problems in applied mathematics, as described in the papers cited and in Sewell, 1973a, b, where elasticity and plasticity are treated in detail from this viewpoint. The general optimization method applies to saddle-generated problems.

Under  different  hypotheses  on  the  generating  functional,  such  as 
boundedness  (instead  of  positivity)  of  operators  representing  its  second 
derivatives,  the  saddle  hypothesis  may  be  lost.  In  this  case  the  general 
embedding  method  can  be  available.  We  show  how  it  recovers  some  recent 
results  of  Barnsley  and  Robinson  (1976). 

Problems  generated  by  inserting  a given  scalar  functional  into 
governing  conditions  expressed  as  sets  of  inequalities  are  covered  in 
the  section  on  nonlinear  programming  methods.  They  also  lead  to 
inequalities  on  linear  functionals. 

Remarks  on  applications  are  made  in  §6. 

(ii)  Origin  of  the  research 

This investigation began in an attempt to generalize to nonlinear problems some approximation methods described by Fujita (1955), who gave an elementary proof of a theorem of Kato (1953) on pointwise estimates for a solution of the linear decomposable operator equation

$$T^{*} T x = f. \tag{1.1}$$

Here $x$ is the unknown and $f$ is given, both 'vectors' in the same linear space. $T$ is a linear operator and $T^{*}$ is its adjoint. For example, if $T \sim \operatorname{grad}$ and $T^{*} \sim -\operatorname{div}$, (1.1) is associated with Poisson's equation. Fujita's paper subsumes in a compact way earlier work on pointwise bounds by Diaz, Greenberg and Weinstein, Prager and Synge, and others (see the references in Fujita's paper). It is convenient to recapitulate here some of Fujita's conclusions, as an introduction to some of the ideas required later on.

We introduce an intermediate variable $u$ in order to decompose the problem (1.1) into the pair of operator equations

$$T^{*} u = -f, \quad (\alpha) \qquad\qquad T x = -u. \quad (\beta) \tag{1.2}$$

Both here, and in the main general theory below, we regard the variable $x$ as an element of a real vector space $E$ having inner product $(\cdot,\cdot)$, and $u$ as an element of another, and normally different, real vector space $F$ having inner product $\langle\cdot,\cdot\rangle$. The linear operators map subspaces $E'$ and $F'$ of $E$ and $F$ (respectively) according to the scheme

$$T : E' \to F, \qquad T^{*} : F' \to E. \tag{1.3}$$




Mutual adjointness of these two operators means that

$$(x, T^{*} u) = \langle u, T x\rangle \tag{1.4}$$

for all $x$ in $E'$ and all $u$ in $F'$. For example, when differential operators form part of $T$ and $T^{*}$, (1.4) is a compact way of writing the integration by parts formula. Many examples of these and other relevant simple ideas from functional analysis are given in an Appendix to the paper of Noble and Sewell (op. cit.).

We distinguish those values of $x$ and $u$ which satisfy both (1.2α) and (1.2β) by $x_0$, $u_0$, i.e. by attaching a subscript zero. Thus $x_0$ is an actual solution of (1.1). Let $u_\alpha$ be any solution of the single constraint (1.2α). Let $x_\beta$, $u_\beta$ be any pair satisfying only (1.2β), so that $x_\beta$ is an arbitrary vector in the domain of $T$ and generates a consequent $u_\beta$. In other words

$$T^{*} u_\alpha = -f, \qquad T x_\beta = -u_\beta. \tag{1.5}$$

In general $u_\alpha \neq u_\beta$ unless both are the $u_0$ belonging to the actual solution.

Then  Fujita's  conclusions  can  be  summarized  as  follows. 

(a) The dual extremum principles, giving what can be called upper and lower 'energy' bounds in appropriate contexts, are (Fujita, equation (2.3))

$$\tfrac12\|u_\alpha\|^2 \ge \tfrac12 (x_0, f) = \tfrac12\|T x_0\|^2 \ge (x_\beta, f) - \tfrac12\|T x_\beta\|^2. \tag{1.6}$$

The norms here are all in the space $F$, but later on it will not cause confusion to use the same symbol for norms of elements of $E$. These principles bound a linear functional of $x_0$, with given coefficient $f$ in $E$. This can be the actual work of given forces in mechanical problems.
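As a finite-dimensional illustration of the chain $\tfrac12\|u_\alpha\|^2 \ge \tfrac12(x_0,f) \ge (x_\beta,f) - \tfrac12\|Tx_\beta\|^2$, which is not taken from Fujita's paper, the following Python sketch takes $T$ to be a small real matrix, so that $T^{*}$ is its transpose and both inner products are ordinary dot products. The matrix, the vector f, the null vector of $T^{*}$ and the perturbation sizes are all assumptions chosen only for illustration.

```python
# Hypothetical data: T is a 3x2 real matrix, so E = R^2, F = R^3 and T* is the transpose.
T = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
f = [1.0, -2.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def apply_T(x):
    return [dot(row, x) for row in T]

def solve_TstarT(rhs):
    # Solve the 2x2 system T*T x = rhs by Cramer's rule.
    c0, c1 = zip(*T)
    a, b, d = dot(c0, c0), dot(c0, c1), dot(c1, c1)
    det = a * d - b * b
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - b * rhs[0]) / det]

x0 = solve_TstarT(f)                      # actual solution of T*T x = f
u0 = [-v for v in apply_T(x0)]            # u0 = -T x0

null_vec = [2.0, -2.0, 1.0]               # T* null_vec = 0 for this particular T
u_alpha = [a + 0.3 * n for a, n in zip(u0, null_vec)]   # satisfies T* u = -f only
x_beta = [x0[0] + 0.2, x0[1] - 0.1]       # arbitrary trial vector in E

upper = 0.5 * dot(u_alpha, u_alpha)       # (1/2)||u_alpha||^2
middle = 0.5 * dot(x0, f)                 # (1/2)(x0, f)
Tx_beta = apply_T(x_beta)
lower = dot(x_beta, f) - 0.5 * dot(Tx_beta, Tx_beta)
print(upper, middle, lower)
```

In this example the two gaps are the squared distances $\tfrac12\|u_\alpha - u_0\|^2$ and $\tfrac12\|u_0 - u_\beta\|^2$, so the bounds tighten quadratically as the trial vectors approach the solution.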

(b) If $q$ is an arbitrary vector in $F'$, the linear functional

$$\langle u_0, q\rangle = -(x_0, T^{*} q)$$

is  bounded  on  both  sides  by  (Fujita  (3.6)) 


and  also  by  (Fujita  (3.8)) 

$$\|u_\alpha - u_\beta\|\,\|q\| \ge |\langle u_\beta - u_0, q\rangle| \tag{1.8}$$

and

$$\|u_\alpha - u_\beta\|\,\|q\| \ge |\langle u_\alpha - u_0, q\rangle|. \tag{1.9}$$

We call (1.8) and (1.9) Fujita's 'weak' estimates and (1.7) his 'strong' estimate because more is given away to get the weak inequalities than the strong one. We shall recover some of these results below, by proofs different from those of Fujita, as simple illustrations of our framework.

Equations (1.7) - (1.9) suggest the following approach to the problem of obtaining pointwise bounds. Remembering that $u_0 = -T x_0$, choose $q$ so that $T^{*} q$ has a delta function behavior in such a way that

$$\langle u_0, q\rangle = -(x_0, T^{*} q) = -(x_0)_P \tag{1.10}$$

where $(x_0)_P$ denotes the value of the exact solution $x_0$ at the point $P$. Then (1.9), for example, gives

$$\|u_\alpha - u_\beta\|\,\|q\| \ge |(x_0)_P + \langle u_\alpha, q\rangle|, \tag{1.11}$$

and we have found pointwise bounds on $x_0$ at $P$. This procedure is useful for one-dimensional problems and in two and three dimensional problems for bounding quantities on the boundaries. But if, for instance, we try to bound the potential at an interior point in a problem involving Poisson's equation, $q$ has to behave like $\operatorname{grad}(1/r)$ near this point and $\|q\|$ involves a divergent integral. This difficulty has been circumvented by various authors in an ingenious way, the essence of which depends on choosing $q$ to have the form $q = q' - Tp'$, where $T^{*} q'$ and $T^{*} T p'$ have exactly the same type of $\delta$-function behavior, with $q'$ such that (1.10) is true with $q'$ in place of $q$, and $p'$ is in the domain of $T$. We can deduce from (1.9)

$$\|u_\alpha - u_\beta\|\,\|q' - Tp'\| \ge |\langle u_\alpha, q\rangle - (f, p') + (x_0)_P|. \tag{1.12}$$

The expression on the left is finite since $q'$ and $Tp'$ are chosen so that their singularities at $P$ cancel. A numerical example is discussed in Fujita (1955).
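A discrete analogue of the pointwise bound $\|u_\alpha - u_\beta\|\,\|q\| \ge |(x_0)_P + \langle u_\alpha, q\rangle|$ can be sketched in Python as follows. Here $T$ is again a small hypothetical matrix, the 'point' $P$ is the first component of $x_0$, and $q$ is chosen so that $T^{*}q$ is the corresponding unit vector (the finite-dimensional stand-in for a delta function). All numerical data are assumptions made for illustration; this is not Fujita's numerical example.

```python
import math

# Hypothetical data: T is a 3x2 real matrix; T* is its transpose.
T = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
f = [1.0, -2.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def apply_T(x):
    return [dot(row, x) for row in T]

def apply_Tstar(u):
    return [dot(col, u) for col in zip(*T)]

def norm(a):
    return math.sqrt(dot(a, a))

x0 = [7.0 / 9.0, -5.0 / 9.0]              # solves T*T x = f for the data above
u0 = [-v for v in apply_T(x0)]

# alpha-point: u0 plus a multiple of a null vector of T*, so T* u_alpha = -f still holds.
u_alpha = [a + 0.1 * n for a, n in zip(u0, [2.0, -2.0, 1.0])]
# beta-point: an arbitrary x_beta and the u_beta = -T x_beta it generates.
x_beta = [x0[0] + 0.2, x0[1] - 0.1]
u_beta = [-v for v in apply_T(x_beta)]

q = [1.0, 0.0, 0.0]                       # T* q = (1, 0), so (x0, T* q) = x0[0]
centre = -dot(u_alpha, q)
radius = norm([a - b for a, b in zip(u_alpha, u_beta)]) * norm(q)
lo, hi = centre - radius, centre + radius  # two-sided bound on x0[0]
print(lo, x0[0], hi)
```

Only the α- and β-points and the assignable $q$ enter the bound; the exact $x_0$ is used here solely to build the trial points and to check the result.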

Although we have been able to obtain pointwise bounds in a number of concrete nonlinear problems by essentially generalizing (1.11), as for example in §6 below, we have not been able to find a natural generalization of (1.12) in the abstract nonlinear setting of our work.

Our formalism throws light on some bounding principles developed by Martin (see, for example, 1964, 1966) in the special context of elasticity and creep. Martin exploits ideas connected with energy, complementary energy, virtual work, etc. We show in §6 that his formulae apply in a quite general context by exploiting simply convexity and the structure of the basic equations. The relevance of convexity in the dual extremum principles of continuum mechanics was originally pointed out by Hill (1956).

(iii) General governing equations

Our general theory is set in the same two inner product spaces having typical elements $x$ in $E$ and $u$ in $F$ which are described after (1.2). We consider the class of possibly nonlinear problems of generalized Lagrangian type

$$\frac{\partial L}{\partial x} = 0, \quad (\alpha) \qquad\qquad \frac{\partial L}{\partial u} = 0, \quad (\beta) \tag{1.13}$$

generated by a given functional $L[x,u]$ of $x$ and $u$ (Sewell, 1973a, equations (25)). The partial gradients in (1.13) are Gateaux differentials, as in the familiar process of computing a 'first variation' and picking out the coefficients of increments in the varied argument. Thus the pair (1.13) is in effect a variational principle.

The problem (1.2) is recovered from (1.13) with the special example

$$L = (x, T^{*} u) + (f, x) + \tfrac12\langle u, u\rangle. \tag{1.14}$$


$$T^{*} u = \frac{\partial X}{\partial x}, \quad (\alpha) \qquad\qquad T x = \frac{\partial X}{\partial u}, \quad (\beta) \tag{1.16}$$

of Hamiltonian type proposed for study by Noble (1964). Another concrete example is generated by

$$L = (x, T^{*} u) + (f, x) - \tfrac12 p\,(x, x) + \tfrac12\langle u, u\rangle \tag{1.17}$$

where $p$ is a given scalar. As with (1.2), it is possible to eliminate $u$ from (1.13) with (1.17) and recover a single decomposable generating equation

$$(T^{*} T + pI)x = f \tag{1.18}$$

where $I$ is the identity operator. Dual extremum principles for this when $p > 0$ were studied by Noble and Sewell (op. cit., §14).

Problems whose ab initio version is nondecomposable via an intermediate variable in the above sense may still be brought into the scheme (1.13) by embedding. For example, if the ab initio equation is

$$N(x) = 0 \tag{1.19}$$

where $N$ is a possibly nonlinear operator, we may seek to identify this equation as (1.13β) by introducing a $u$ to appear linearly in some $L[x,u]$ like a Lagrange multiplier. This embedding procedure induces a second 'adjoint' equation (1.13α) to be considered in conjunction with (1.19), and perhaps containing an assignable coefficient in the linear functional to be estimated. It is in this way that the work of Barnsley and Robinson (1976) is brought into our framework. Even if the ab initio problem is decomposable, it may still be embedded in the stated manner into a larger problem. Barnsley and Robinson (1974) do this in their study of the linear equation (1.18) for $p = 1$.

In  the  general  problem  (1.13)  certain  additional  hypotheses  are 
required  about  the  general  functional  L[x,u]  . Typically  these  set  bounds 
on  the  second  derivatives.  In  particular  the  functional  may  be  a saddle 
functional.  For  example,  (1.17)  is  strictly  convex  in  u and,  if  p > 0, 
strictly  concave  in  x.  If  p = 0 as  in  (1.14)  it  is  only  weakly  concave 
in  x.  If  p < 0 it  is  not  a saddle  functional.  Such  hypotheses  are 

made  precise  in  the  next  Section. 

Another source of generalization is that the equations (1.13) can be replaced by systems of inequalities (inequalities (33) and (34) of Sewell, 1973a) and dual extremum principles can still be proved under the saddle hypothesis. These sometimes contain direct estimates for linear functionals (see §5).




2. General Optimization Method

(i) Saddle functional

Suppose that $L[x,u]$ is a given saddle functional defined over some domain in the product space $E \times F$.

The analytical expression of the saddle property is in terms of arbitrary pairs of 'points' in this domain, which we label $x_+, u_+$ and $x_-, u_-$ and refer to as the 'plus point' and the 'minus point' respectively. Then $L[x,u]$ is called a (weak) saddle functional if, for any pair of distinct points in its domain,

$$L_+ - L_- - \Big(x_+ - x_-, \frac{\partial L}{\partial x}\Big|_+\Big) - \Big\langle u_+ - u_-, \frac{\partial L}{\partial u}\Big|_-\Big\rangle \ge 0. \tag{2.1}$$


The  subscripts  attached  to  L and  its  gradients  mean  evaluation  at  the 
indicated  points.  Such  a functional  is  concave  with  respect  to  x at 
each  fixed  u,  and  convex  with  respect  to  u at  each  fixed  x - hence 
the  name,  and  Fig.  2.1  is  a schematic  illustration  of  its  individual  cross- 
sections  with  the  spaces  E and  F.  The  weak  inequality  permitted  in 
(2.1)  for  distinct  pairs  of  points  means  that  the  surface  can  contain  linear 
segments  such  as  straight  lines  or  plane  facets.  Otherwise  it  would  be 
called  a strict  saddle  functional. 
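For the quadratic Lagrangian $L = (x, T^{*}u) + (f,x) + \tfrac12\langle u,u\rangle$ of (1.14), the left side of the saddle inequality reduces to $\tfrac12\|u_+ - u_-\|^2$, which is non-negative for every pair of points. The following Python sketch checks this identity at a few randomly chosen plus and minus points. The matrix $T$, the vector f and the random sampling are assumptions introduced only for this illustration.

```python
import random

# Hypothetical data: T is a 3x2 real matrix; T* is its transpose.
T = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
f = [1.0, -2.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def apply_T(x):
    return [dot(row, x) for row in T]

def apply_Tstar(u):
    return [dot(col, u) for col in zip(*T)]

def L(x, u):
    # Quadratic Lagrangian (x, T*u) + (f, x) + (1/2)<u, u>.
    return dot(x, apply_Tstar(u)) + dot(f, x) + 0.5 * dot(u, u)

def dL_dx(x, u):
    return [a + b for a, b in zip(apply_Tstar(u), f)]

def dL_du(x, u):
    return [a + b for a, b in zip(apply_T(x), u)]

def saddle_quantity(xp, up, xm, um):
    # L+ - L- - (x+ - x-, dL/dx at plus) - <u+ - u-, dL/du at minus>
    dx = [a - b for a, b in zip(xp, xm)]
    du = [a - b for a, b in zip(up, um)]
    return L(xp, up) - L(xm, um) - dot(dx, dL_dx(xp, up)) - dot(du, dL_du(xm, um))

rng = random.Random(0)
mismatches = []
for _ in range(5):
    xp = [rng.uniform(-1, 1) for _ in range(2)]
    up = [rng.uniform(-1, 1) for _ in range(3)]
    xm = [rng.uniform(-1, 1) for _ in range(2)]
    um = [rng.uniform(-1, 1) for _ in range(3)]
    du = [a - b for a, b in zip(up, um)]
    mismatches.append(abs(saddle_quantity(xp, up, xm, um) - 0.5 * dot(du, du)))
print(max(mismatches))
```

Because the quantity depends only on $u_+ - u_-$, this particular $L$ is strictly convex in $u$ but only weakly concave (in fact linear) in $x$.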

This analytical statement of a saddle functional was given by Sewell (1969, equation (2.50)). For simplicity in what follows we adopt the convention that the vertical bar attached to gradients is omitted. For example, $\partial L/\partial x_+$ will denote the Gateaux differential with respect to $x$ evaluated at the plus point $x_+, u_+$ (and not merely a gradient with respect to $x_+$).

Fig. 2.1. Saddle functional L[x,u]

Unless  otherwise  stated,  plus  and  minus  points  will  always  be 
arbitrary  points  in  the  domain  of  L,  throughout  the  paper.  Our  entire 
theory  will  rest  on  the  facility  with  which  different  and  especially 
convenient  interpretations  may  be  assigned  to  them.  Such  choices  will 
be  indicated  by  an  appropriate  suffix. 

For example, when $L$ is used to generate the governing equations (1.13), we can divide them into the two subsets labelled (α) and (β). Each subset considered separately is an underdetermined problem whose solutions will be supposed known. They may be easier to find than the solutions to (1.13) itself. We shall use $x_\alpha, u_\alpha$ to denote any solution of (1.13α) alone, and $x_\beta, u_\beta$ to denote any solution of (1.13β) alone. In other words,

$$\frac{\partial L}{\partial x_\alpha} = 0, \qquad \frac{\partial L}{\partial u_\beta} = 0. \tag{2.2}$$

Neither point need satisfy the other equation, except when it happens to be a solution of the complete problem (1.13).

In what follows we shall often, for the sake of emphasis, denote an actual solution point of (1.13) by $x_0, u_0$, and attach a subscript zero to other quantities evaluated there, as we did for (1.2). Such a solution point need not be unique.

(ii)  Dual  extremum  principles 

First choose the particular interpretations

$$x_+, u_+ = x_\alpha, u_\alpha \quad\text{and}\quad x_-, u_- = x_0, u_0 \tag{2.3}$$

in (2.1). By (2.2) there follows immediately the stationary minimum principle $L_\alpha \ge L_0$. Next choose

$$x_+, u_+ = x_0, u_0 \quad\text{and}\quad x_-, u_- = x_\beta, u_\beta \tag{2.4}$$

in (2.1). This implies the stationary maximum principle $L_0 \ge L_\beta$. Thus we arrive at the dual extremum principles

$$L_\alpha \ge L_0 \ge L_\beta \tag{2.5}$$
derived with increasing generality in our earlier papers (Noble and Sewell, op. cit., inequalities (1); Sewell 1973a, inequalities (31) and (32)).

For problem (1.1) they are illustrated by (1.6) in which

$$L_\alpha = \tfrac12\|u_\alpha\|^2, \qquad L_\beta = (x_\beta, f) - \tfrac12\|T x_\beta\|^2. \tag{2.6}$$

The second order quantities given away to get these particular estimates are

$$L_\alpha - L_0 = \tfrac12\|u_\alpha - u_0\|^2, \qquad L_0 - L_\beta = \tfrac12\|u_0 - u_\beta\|^2. \tag{2.7}$$

The extremum principles (1.6) can be rewritten as error estimates in terms of the difference between the bounds

$$L_\alpha - L_\beta = \tfrac12\|u_\alpha - u_\beta\|^2 \ge \begin{cases} \tfrac12\|u_\alpha - u_0\|^2, \\ \tfrac12\|u_\beta - u_0\|^2. \end{cases} \tag{2.8}$$

(iii)  General  bounds  for  linear  functionals 


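If we choose

$$x_+, u_+ \ \text{arbitrary, and}\quad x_-, u_- = x_0, u_0 \tag{2.9}$$

in (2.1), we find

$$L_+ - L_0 - \Big(x_+, \frac{\partial L}{\partial x_+}\Big) \ge -\Big(x_0, \frac{\partial L}{\partial x_+}\Big). \tag{2.10}$$

If instead we choose

$$x_+, u_+ = x_0, u_0 \quad\text{and}\quad x_-, u_- \ \text{arbitrary}, \tag{2.11}$$

then (2.1) gives

$$L_0 - L_- + \Big\langle u_-, \frac{\partial L}{\partial u_-}\Big\rangle \ge \Big\langle u_0, \frac{\partial L}{\partial u_-}\Big\rangle. \tag{2.12}$$

Next we add $L_0 - L_\beta \ge 0$ to (2.10), giving

$$L_+ - L_\beta - \Big(x_+, \frac{\partial L}{\partial x_+}\Big) \ge -\Big(x_0, \frac{\partial L}{\partial x_+}\Big). \tag{2.13}$$

Also we add $L_\alpha - L_0 \ge 0$ to (2.12), giving

$$L_\alpha - L_- + \Big\langle u_-, \frac{\partial L}{\partial u_-}\Big\rangle \ge \Big\langle u_0, \frac{\partial L}{\partial u_-}\Big\rangle. \tag{2.14}$$

It can be seen that (2.13) and (2.14) offer bounds on the linear functionals $(x_0, \partial L/\partial x_+)$ and $\langle u_0, \partial L/\partial u_-\rangle$ of the solution variables $x_0$, $u_0$. The bounds on the left are in terms of arbitrary assignable points $x_+, u_+$ or $x_-, u_-$, and the supposedly known α- and β-points.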

(iv)  Optimization  of  the  extremum  principles 

The choices made in (2.3) and (2.4) are special choices of the pairs of points in (2.1), made with particular solutions of (2.2α) and (2.2β) respectively, and designed to lead immediately to simple conclusions (2.5). Such particular solutions need not be unique, and in specific problems it may be possible to decrease $L_\alpha$ and/or increase $L_\beta$ by optimizing within subsets of particular solutions. In general the problem is to find such subsets.

In the case of problem (1.2), Fujita (op. cit. §4) specifies subsets appropriate for improving the bounds (2.6). Here we make a rather different remark about that problem to help motivate our subsequent procedures. Noticing that the Lagrangian (1.14) implies that the left side of (2.1) is exactly equal to

$$\tfrac12\|u_+ - u_-\|^2, \tag{2.15}$$

we consider the possibility of minimizing this 'square of the error' among those plus and minus points which have the property

$$u_+ - u_- = \lambda u_\alpha + \mu u_\beta - u_0 \tag{2.16}$$

for disposable scalars $\lambda$ and $\mu$. This simultaneous procedure corresponds to finding the minimum of an elliptic paraboloid. A special feature of (1.2) leads the simultaneous optimization to the improved pair of dual extremum principles

$$L_\alpha = \hat L_\alpha \ge L_0 \ge \hat L_\beta \ge L_\beta \tag{2.17}$$

where

$$\hat L_\beta = \frac{(x_\beta, f)^2}{2\|T x_\beta\|^2}. \tag{2.18}$$

The result of the simultaneous procedure is therefore the same as the two usual separate choices of first (trivially) setting $\lambda = 1$, $\mu = 0$, and secondly setting $\lambda = 0$ and optimizing the nonhomogeneous $L_\beta$ with respect to the scale factor $\mu$.

The special feature of problem (1.2) which leads simultaneous and separate procedures to the same result is an orthogonality property

$$\langle u_\beta, u_\alpha - u_0\rangle = 0, \tag{2.19}$$

i.e. any $u_\beta = -T x_\beta$ is orthogonal to the null space of $T^{*}$, since $T^{*} u_\alpha = T^{*} u_0 = -f$.
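The separate optimization of the lower bound with respect to a scale factor can be illustrated numerically. Replacing $x_\beta$ by $\mu x_\beta$ in $L_\beta = (x_\beta,f) - \tfrac12\|Tx_\beta\|^2$ gives a quadratic in $\mu$ whose maximum, at $\mu = (x_\beta,f)/\|Tx_\beta\|^2$, is $(x_\beta,f)^2/(2\|Tx_\beta\|^2)$. The Python sketch below checks this for a small hypothetical matrix $T$; the data and the trial vector are assumptions made only for illustration.

```python
# Hypothetical data: T is a 3x2 real matrix; T* is its transpose.
T = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
f = [1.0, -2.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def apply_T(x):
    return [dot(row, x) for row in T]

def L_beta(x):
    # Lower bound (x, f) - (1/2)||T x||^2 evaluated at a trial vector x.
    Tx = apply_T(x)
    return dot(x, f) - 0.5 * dot(Tx, Tx)

x0 = [7.0 / 9.0, -5.0 / 9.0]              # solves T*T x = f for the data above
L0 = 0.5 * dot(x0, f)                     # exact value (1/2)(x0, f)

x_beta = [1.0, 0.0]                       # arbitrary trial direction
Tx = apply_T(x_beta)
mu_star = dot(x_beta, f) / dot(Tx, Tx)    # optimal scale factor
L_hat = dot(x_beta, f) ** 2 / (2.0 * dot(Tx, Tx))
scaled = [mu_star * v for v in x_beta]
print(L_beta(x_beta), L_beta(scaled), L_hat, L0)
```

For this trial direction the unscaled bound is zero, while the scaled bound recovers a quarter of a unit of the exact value 17/18, without any further knowledge of the solution.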
(v) A general class of competing vectors

Looking ahead from (2.16) to the problem of optimizing the bounds for linear functionals for general $L[x,u]$, and noting that it is desirable to free Fujita's proofs from their dependence on Schwarz's inequality (which is a consequence of minimizing a single quadratic), we propose to study choices of the plus and minus points which have the properties

$$x_+ - x_- = r x_\alpha + s x_\beta + h p + i x_0, \qquad u_+ - u_- = \lambda u_\alpha + \mu u_\beta + k q + j u_0. \tag{2.20}$$

Here the eight coefficients $r, s, h, i, \lambda, \mu, k, j$ are disposable real scalars which will be normalized here by taking $i = \pm 1$, $j = \pm 1$. The choices (2.3) and (2.4) are special cases of (2.20) with $h = k = 0$.

The elements $p$ in $E$ and $q$ in $F$ are to be regarded as assignable. Note that dual extremum principles such as (1.6) estimate a linear functional $\tfrac12 (x_0, f)$ whose coefficient $f$ was already given in the statement of the problem, and was therefore not necessarily assignable. Our basic objective is to estimate a linear functional whose coefficient may be chosen without that constraint.

Before attempting to use the class (2.20) to improve the general bounds for linear functionals given in (2.13) and (2.14), we notice one more thing. Addition of the extremum principles to (2.10) and (2.12) eliminates the unknown $L_0$, but at the expense of giving away the first or second term in the identity

$$0 = (L_\alpha - L_0) + (L_0 - L_\beta) - (L_\alpha - L_\beta). \tag{2.21}$$

In Fujita's linear theory this is masked by the special orthogonality property (2.19) in the form

$$0 = \tfrac12\|u_\alpha - u_0\|^2 + \tfrac12\|u_0 - u_\beta\|^2 - \tfrac12\|u_\alpha - u_\beta\|^2. \tag{2.22}$$

His weak estimates (1.8) - (1.9) require that the first or second of (2.22) be given away, and do not therefore depend on the orthogonality per se. It is therefore his weak estimates which we shall be trying to generalize when we optimize (2.13) and (2.14).

On the other hand, his strong inequality (1.7) does not give away the stated terms $L_\alpha - L_0$ and $L_0 - L_\beta$ but it does seem to depend critically on the orthogonality property

$$\|\tfrac12(u_\alpha + u_\beta) - u_0\|^2 - \tfrac14\|u_\alpha - u_\beta\|^2 = \langle u_\alpha - u_0, u_\beta - u_0\rangle = (T^{*}(u_\alpha - u_0), x_0 - x_\beta) = 0. \tag{2.23}$$

For this reason we expect his strong inequality to be harder to generalize, even though something may be achieved in particular cases (see §2(x)).
(vi) Intermediate generality

In seeking to optimize the general bounds (2.13) and (2.14) on linear functionals, we find it illuminating to concentrate first upon some cases which are more general than (1.14) or (1.17), but less general than an arbitrary saddle functional $L[x,u]$. These are separable cases of type

$$L = (x, T^{*}u) - N(x) + G(u) \tag{2.24}$$

in which $N(x)$ and $G(u)$ are convex functionals of the single variables $x$ and $u$ respectively. In passing we can notice that

$$L = (x, T^{*}u) - N(x)G(u) \tag{2.25}$$

is concave in $x$ and convex in $u$, provided both $G \ge 0$ (or $N$ linear) and $N \le 0$ (or $G$ linear) in addition to the convexity of $N$ and $G$.

Examples of (2.24) in which one of $N(x)$ or $G(u)$ is linear arise in fields such as network theory and elasticity. In the latter $x$ can be a generalized stress (cf. Sewell 1973a, b) or bending moment entering a convex $N(x)$, with displacement $u$ appearing in a linear $G(u)$.

In the next subsection we carry out the optimization for

$$L = (x, T^{*}u) + (f, x) + G(u) \tag{2.26}$$

obtained from (1.14) by letting $G$ be any strictly convex functional, instead of quadratic. From (1.13) this generates the problem

$$T^{*}u = -f, \quad (\alpha) \qquad\qquad T x = -g(u), \quad (\beta) \tag{2.27}$$

where $g(u) = G'(u)$. A prime will signify gradients of $G(u)$, and also of $g(u)$ below.

The inequality (2.13) reduces to

$$G(u_+) - (x_\beta, T^{*}u_\beta + f) - G(u_\beta) \ge -\Big(x_0, \frac{\partial L}{\partial x_+}\Big) \tag{2.28}$$
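in which

$$\frac{\partial L}{\partial x_+} = T^{*}u_+ + f, \qquad T x_\beta = -g(u_\beta). \tag{2.29}$$

Here $u_+$ is any element in the domain of $T^{*}$. The inequality (2.14) reduces to

$$G(u_\alpha) - (f, x_-) + \Big\langle u_-, \frac{\partial L}{\partial u_-} - T x_-\Big\rangle - G(u_-) \ge \Big\langle u_0, \frac{\partial L}{\partial u_-}\Big\rangle \tag{2.30}$$

in which

$$T^{*}u_\alpha = -f, \qquad \frac{\partial L}{\partial u_-} = T x_- + g(u_-). \tag{2.31}$$

Here $x_-$ is any element in the domain of $T$, and $u_-$ is any element in the domain of $g(u)$.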

(vii) Optimization of the first bound

Recalling (2.20), we choose for the $u_+$ in (2.28) the restricted class

$$u_+ = u_\alpha + kq \tag{2.32}$$

for any $q$ now in the domain $F'$ of $T^{*}$, and any scalar $k$. Then (2.29)$_1$ with (2.31)$_1$ implies

$$\frac{\partial L}{\partial x_+} = kT^{*}q \tag{2.33}$$

and (2.28) becomes

$$G(u_\alpha + kq) - (x_\beta, T^{*}u_\beta + f) - G(u_\beta) \ge -k(x_0, T^{*}q). \tag{2.34}$$

We now optimize this inequality approximately with respect to $k$. It will turn out under suitable circumstances that $k$ is small. Acting on this assumption we write
$$G(u_\alpha + kq) = G(u_\alpha) + k\langle g(u_\alpha), q\rangle + \tfrac12 k^2\langle g'(u_\alpha)q, q\rangle + O(k^3). \tag{2.35}$$

Inserting this in (2.34) and omitting the higher order terms gives

$$\tfrac12 C + F_1 k + \tfrac12 B_1 k^2 \ge 0 \tag{2.36}$$

with the following shorthand for the coefficients

$$\tfrac12 C \equiv G(u_\alpha) - (x_\beta, T^{*}u_\beta + f) - G(u_\beta) = L_\alpha - L_\beta \ge 0, \tag{2.37}$$

$$F_1 \equiv \langle g(u_\alpha) - g(u_0), q\rangle = \langle g(u_\alpha), q\rangle + (x_0, T^{*}q), \tag{2.38}$$

$$\tfrac12 B_1 \equiv \tfrac12\langle q, g'(u_\alpha)q\rangle. \tag{2.39}$$

A sufficient condition for the strict convexity of $G(u)$ can be given in terms of a mean value theorem (see §3), and implies that

$$C = \langle u_\alpha - u_\beta, g'(\bar u)(u_\alpha - u_\beta)\rangle > 0 \tag{2.40}$$

when $u_\alpha$ and $u_\beta$ are distinct, where the operator $g'(\bar u)$ is evaluated at some intermediate $\bar u$ between $u_\alpha$ and $u_\beta$. It also implies the strict inequality

$$B_1 > 0 \tag{2.41}$$

for $q \ne 0$.

Under these two strict inequalities, we optimize (2.36) with respect to $k$ by considering two cases.

(a) $k > 0$ implies

$$\frac{C}{k} + 2F_1 + B_1 k \ge 0. \tag{2.42}$$
The left side is least, e.g. by completing the square, at $k = (C/B_1)^{1/2}$ and the best result is

$$F_1 + (CB_1)^{1/2} \ge 0. \tag{2.43}$$

(b) $k < 0$ implies

$$0 \ge \frac{C}{k} + 2F_1 + B_1 k. \tag{2.44}$$

The right side is greatest at $k = -(C/B_1)^{1/2}$, and the best result is

$$0 \ge F_1 - (CB_1)^{1/2}. \tag{2.45}$$

Inequalities (2.43) and (2.45) for the objective linear functional

$$(x_0, T^{*}q) = -\langle g(u_0), q\rangle \tag{2.46}$$

of the solution variable $x_0$ can be summarized as

$$|\langle g(u_\alpha), q\rangle + (x_0, T^{*}q)| \le (CB_1)^{1/2}. \tag{2.47}$$

It has to be remembered that this result is not rigorous because higher order terms were omitted in going from (2.35) to (2.36). Rigorous bounds can be obtained by inserting

$$k = \pm(C/B_1)^{1/2} \tag{2.48}$$

into (2.34). These values of $k$ will not in general provide the best bounds on (2.46), but if $u_\alpha \approx u_\beta$ they will be close to the optimum bounds because $C = 2(L_\alpha - L_\beta)$ will then be small. Without loss of generality we can assume that $B_1$ is of order unity so that the resulting $k$ is small, and the neglect of higher order terms in (2.36) will be justified.

Obviously the best bounds are obtained when $u_\alpha$ and $u_\beta$ are as nearly equal as possible, since as $u_\alpha \to u_\beta$, $C \to 0$ and the left side of (2.47) tends to zero.
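The rigorous version of this procedure can be tried on a small nonlinear example. The Python sketch below is an illustration under stated assumptions, not part of the original analysis: $T$ is a hypothetical 3x2 matrix, $G(u) = \sum_i (\tfrac12 u_i^2 + \tfrac14 u_i^4)$ is an assumed strictly convex functional with $g(u) = u + u^3$, and the problem data are built backwards from a chosen $x_0$ so that the exact answer is known. The values $k = \pm (C/B_1)^{1/2}$ are inserted into the exact inequality $G(u_\alpha + kq) - (x_\beta, T^{*}u_\beta + f) - G(u_\beta) \ge -k(x_0, T^{*}q)$, with $q$ chosen so that $(x_0, T^{*}q)$ is the first component of $x_0$.

```python
import math

# Hypothetical data: T is a 3x2 real matrix; T* is its transpose.
T = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def apply_T(x):
    return [dot(row, x) for row in T]

def apply_Tstar(u):
    return [dot(col, u) for col in zip(*T)]

def G(u):
    # Assumed strictly convex functional; its gradient is g(u) = u + u^3.
    return sum(0.5 * v * v + 0.25 * v ** 4 for v in u)

def g_inverse(c):
    # Solve v + v^3 = c componentwise by Newton's method.
    out = []
    for ci in c:
        v = ci
        for _ in range(60):
            v -= (v + v ** 3 - ci) / (1.0 + 3.0 * v * v)
        out.append(v)
    return out

# Build a problem with known solution: T x0 = -g(u0), T* u0 = -f.
x0 = [0.5, -0.25]
u0 = g_inverse([-v for v in apply_T(x0)])
f = [-v for v in apply_Tstar(u0)]

# alpha-point (T* u_alpha = -f) and beta-point (T x_beta = -g(u_beta)).
u_alpha = [a + 0.05 * n for a, n in zip(u0, [2.0, -2.0, 1.0])]
x_beta = [x0[0] + 0.05, x0[1] - 0.02]
u_beta = g_inverse([-v for v in apply_T(x_beta)])

q = [1.0, 0.0, 0.0]        # T* q = (1, 0), so (x0, T* q) = x0[0]

const = dot(x_beta, [a + b for a, b in zip(apply_Tstar(u_beta), f)]) + G(u_beta)

def phi(k):
    # Left side of the exact inequality; phi(k) >= -k * (x0, T* q) for every k.
    return G([a + k * b for a, b in zip(u_alpha, q)]) - const

C = 2.0 * phi(0.0)                                   # C = 2 (L_alpha - L_beta)
B1 = sum(qi * qi * (1.0 + 3.0 * ui * ui) for qi, ui in zip(q, u_alpha))
k = math.sqrt(C / B1)
lower = -phi(k) / k        # from k > 0
upper = phi(-k) / k        # from k < 0
exact = dot(x0, apply_Tstar(q))
print(lower, exact, upper)
```

The exact solution is used here only to manufacture consistent data and to check the outcome; the bounds themselves require only the α-point, the β-point and the assignable $q$.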


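The linear problem (1.2) is recovered with

$$G(u) = \tfrac12\langle u, u\rangle, \qquad g(u) = u, \qquad g'(u) = I \tag{2.49}$$

so that

$$C = \|u_\alpha - u_\beta\|^2, \qquad B_1 = \|q\|^2.$$

The result (2.47) is then exact, namely

$$|\langle u_\alpha - u_0, q\rangle| \le \|u_\alpha - u_\beta\|\,\|q\|,$$

which is one of Fujita's weak estimates.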


(viii)  Optimization  of  the  second  bound 


Inequality (2.30) is the basis for the second bound, and it involves both $x_-$ and $u_-$. Again recalling (2.20) we choose the class of points

$$x_- = x_\beta + hp, \qquad u_- = u_\beta + kq \tag{2.50}$$

for any $p$ in the domain $E'$ of $T$, any $q$ such that $u_-$ (like $u_\beta$) is in the domain of $g(u)$, and any scalars $h$ and $k$.

Insertion into (2.31)$_2$, expanding and using (2.29)$_2$ implies

$$\frac{\partial L}{\partial u_-} = hTp + kg'(u_\beta)q + O(k^2) \tag{2.51}$$

with an obvious extension of the $O(k^3)$ notation. Here $g'(u_\beta)$ is an operator acting on $q$. The inequality (2.30) expanded about the β-point $x_\beta, u_\beta$ becomes, after omission of $O(k^3)$ terms,

$$\tfrac12 C + F_2 k + \tfrac12 B_2 k^2 \ge 0 \tag{2.52}$$

because the terms in $h$ cancel exactly. Here $C = 2(L_\alpha - L_\beta)$ as in (2.37), but the other coefficients are now

$$F_2 \equiv \langle u_\beta - u_0, g'(u_\beta)q\rangle, \tag{2.53}$$

2B2  “ 2 * 

The form of these shows that it would be enough to regard $g'(u_\beta)q = Q$ (say) as arbitrary, i.e. to have begun with $q = [g'(u_\beta)]^{-1}Q$ in (2.50).

Optimizing (2.52) exactly as before, we find

$$|\langle u_\beta - u_0, Q\rangle| \le (CB_2)^{1/2} \tag{2.55}$$

in place of (2.47). Again this is approximate, but rigorous bounds can be obtained by using

$$k = \pm(C/B_2)^{1/2} \tag{2.56}$$

to deduce exact values from (2.50) for substitution in (2.30).

In the linear problem (1.2) we have only to put $g'(u) = I$ in (2.53) and (2.54) to see that (2.55) becomes

$$|\langle u_\beta - u_0, q\rangle| \le \|u_\alpha - u_\beta\|\,\|q\|,$$

which is Fujita's other weak estimate.
(ix) Optimization in the general case for weak bounds


Here  we  indicate  the  optimization  procedure  that  would  be  required 
in  the  general  case  for  weak  bounds,  i.e.  as  it  would  apply  to  (2.13) 
and  (2. 14). 

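At certain points we shall need to suppose that a 'Taylor' expansion of the following type can be employed (see also the discussion in §3(i)):

$$L[x + \xi, u + \eta] = L[x,u] + \Big(\xi, \frac{\partial L}{\partial x}\Big) + \Big\langle \eta, \frac{\partial L}{\partial u}\Big\rangle + \tfrac12\Big(\xi, \frac{\partial^2 L}{\partial x^2}\xi\Big) + \Big(\xi, \frac{\partial^2 L}{\partial x\,\partial u}\eta\Big) + \tfrac12\Big\langle \eta, \frac{\partial^2 L}{\partial u^2}\eta\Big\rangle + \text{higher order terms}. \tag{2.57}$$

In this expansion on the product space $E \times F$, the linear operators act on the elements of $E$ or $F$ which follow them as before, and with the assumed property that

$$\Big(\xi, \frac{\partial^2 L}{\partial x\,\partial u}\eta\Big) = \Big\langle \eta, \frac{\partial^2 L}{\partial u\,\partial x}\xi\Big\rangle. \tag{2.58}$$

This last property is exemplified by the adjointness statement (1.4) for the particular bilinear functional $L = (x, T^{*}u) = \langle u, Tx\rangle$. In general $\partial^2 L/\partial x\,\partial u \ne \partial^2 L/\partial u\,\partial x$, as $T^{*} \ne T$ illustrates. It is assumed that $\partial^2 L/\partial x^2$ and $\partial^2 L/\partial u^2$ are self-adjoint.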

First we wish to optimize (2.13) with respect to all plus points in a suitably chosen family. We consider a family centered on the α-point, i.e. we choose (cf. (2.20) and (2.32))

$$x_+ = x_\alpha + hp, \qquad u_+ = u_\alpha + kq \qquad (2.59)$$

and optimize with respect to the scalars $h$ and $k$.

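The expansion of $L_+$ according to (2.57), and the operator expansion of $\partial L/\partial x_+$, are simplified because $\partial L/\partial x_\alpha = 0$ by (2.2). They are

$$L_+ = L_\alpha + k\left\langle q, \frac{\partial L}{\partial u_\alpha}\right\rangle + \tfrac12 h^2\left(p, \frac{\partial^2 L}{\partial x_\alpha^2}p\right) + hk\left(p, \frac{\partial^2 L}{\partial x\,\partial u_\alpha}q\right) + \tfrac12 k^2\left\langle q, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle + \cdots , \qquad (2.60)$$

$$\frac{\partial L}{\partial x_+} = h\,\frac{\partial^2 L}{\partial x_\alpha^2}p + k\,\frac{\partial^2 L}{\partial x\,\partial u_\alpha}q + \cdots . \qquad (2.61)$$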


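The dots will consistently denote higher order terms in $h$ and $k$. Using (2.61) to eliminate the cross-derivatives from (2.60), and substituting the result back into (2.13) gives

$$L_\alpha - L_\beta - \left(x_\alpha - x_0, \frac{\partial L}{\partial x_+}\right) + k\left\langle q, \frac{\partial L}{\partial u_\alpha}\right\rangle - \tfrac12 h^2\left(p, \frac{\partial^2 L}{\partial x_\alpha^2}p\right) + \tfrac12 k^2\left\langle q, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle + \cdots \ge 0 . \qquad (2.62)$$

We can also expand $\partial L/\partial u_\alpha$ about the solution point $x_0, u_0$, giving

$$\frac{\partial L}{\partial u_\alpha} = \frac{\partial^2 L}{\partial u\,\partial x_0}(x_\alpha - x_0) + \frac{\partial^2 L}{\partial u_0^2}(u_\alpha - u_0) + \cdots \qquad (2.63)$$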


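since $\partial L/\partial u_0 = 0$. Insert this into (2.62), together with (2.61). If we assume that all second order derivatives can be taken at $x_\alpha, u_\alpha$ instead of $x_0, u_0$, to the order of accuracy retained,

$$L_\alpha - L_\beta - h\left(x_\alpha - x_0, \frac{\partial^2 L}{\partial x_\alpha^2}p\right) + k\left\langle u_\alpha - u_0, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle - \tfrac12 h^2\left(p, \frac{\partial^2 L}{\partial x_\alpha^2}p\right) + \tfrac12 k^2\left\langle q, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle + \cdots \ge 0 . \qquad (2.64)$$

Omitting now the higher order terms, (2.64) is of the form

$$C + 2F_1 k + B_1 k^2 + 2G_1 h + A_1 h^2 \ge 0 \qquad (2.65)$$

in which $\tfrac12 C = L_\alpha - L_\beta \ge 0$ by the dual extremum principles, as before. Comparison with (2.64) identifies $F_1 = \langle u_\alpha - u_0, (\partial^2 L/\partial u_\alpha^2)q\rangle$, $B_1 = \langle q, (\partial^2 L/\partial u_\alpha^2)q\rangle$, $G_1 = -(x_\alpha - x_0, (\partial^2 L/\partial x_\alpha^2)p)$ and $A_1 = -(p, (\partial^2 L/\partial x_\alpha^2)p)$. Also $B_1 \ge 0$ and $A_1 \ge 0$ because $L$ is concave in $x$ and convex in $u$.

If $\partial^2 L/\partial x_\alpha^2 \equiv 0$ so that $A_1 = G_1 = 0$, or if we arbitrarily set $h = 0$, we can choose $k$ in (2.65) to find optimum bounds for $F_1$ as in (2.36), and this gives

$$\left\langle u_\alpha - u_0, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle^2 \le 2(L_\alpha - L_\beta)\left\langle q, \frac{\partial^2 L}{\partial u_\alpha^2}q\right\rangle . \qquad (2.66)$$

Similarly, if $\partial^2 L/\partial u_\alpha^2 \equiv 0$ so that $B_1 = F_1 = 0$, or if we arbitrarily set $k = 0$, we can choose $h$ to obtain optimum bounds for $G_1$ as

$$\left(x_\alpha - x_0, \frac{\partial^2 L}{\partial x_\alpha^2}p\right)^2 \le -2(L_\alpha - L_\beta)\left(p, \frac{\partial^2 L}{\partial x_\alpha^2}p\right) . \qquad (2.67)$$

If $A_1 > 0$ and $B_1 > 0$ the inequality (2.65) can be rearranged in the form

$$A_1\left(h + \frac{G_1}{A_1}\right)^2 + B_1\left(k + \frac{F_1}{B_1}\right)^2 + C - \frac{G_1^2}{A_1} - \frac{F_1^2}{B_1} \ge 0 \qquad (2.68)$$

and the choice $h = -G_1/A_1$, $k = -F_1/B_1$ leads to

$$A_1 B_1 C \ge A_1 F_1^2 + B_1 G_1^2 . \qquad (2.69)$$

Inequalities (2.66) and (2.67) can be deduced from (2.69).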
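The step from (2.65) to (2.69) is a completion of squares, and it can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names and the sample coefficient values are hypothetical, chosen only to show that the minimizing scalars are $h = -G_1/A_1$, $k = -F_1/B_1$ and that non-negativity of the minimum is the statement (2.69).

```python
def quadratic_form(h, k, A, B, C, F, G):
    """Left side of (2.65): C + 2*F*k + B*k**2 + 2*G*h + A*h**2."""
    return C + 2.0 * F * k + B * k * k + 2.0 * G * h + A * h * h


def optimal_scalars(A, B, F, G):
    """Minimizing choice h = -G/A, k = -F/B (needs A > 0 and B > 0)."""
    if A <= 0 or B <= 0:
        raise ValueError("completing the square needs A > 0 and B > 0")
    return -G / A, -F / B


def minimum_value(A, B, C, F, G):
    """Value of the quadratic form at the minimizing scalars."""
    h, k = optimal_scalars(A, B, F, G)
    return quadratic_form(h, k, A, B, C, F, G)


def satisfies_2_69(A, B, C, F, G):
    """Inequality (2.69): A*B*C >= A*F**2 + B*G**2."""
    return A * B * C >= A * F * F + B * G * G


# Hypothetical sample coefficients (illustration only).
A, B, C, F, G = 2.0, 4.0, 3.0, 2.0, 1.0
h_opt, k_opt = optimal_scalars(A, B, F, G)
q_min = minimum_value(A, B, C, F, G)
print(h_opt, k_opt, q_min, satisfies_2_69(A, B, C, F, G))
```

Multiplying the minimum value by $A_1 B_1$ gives $A_1B_1C - A_1F_1^2 - B_1G_1^2$, so the sign of the minimum and the truth of (2.69) always agree when $A_1 > 0$ and $B_1 > 0$.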

A similar general analysis can be carried out to optimize (2.14) by expanding $x_-, u_-$ about the β-point, in place of (2.59).

(x)  Strong  bounds 

We return to the remark made after (2.23). When it turns out that $L_0$ happens to be a linear functional of $x_0$ or $u_0$, it may be possible to build estimates on (2.10) or (2.12), without needing to give away the additional quantities $L_0 - L_\beta$ or $L_\alpha - L_0$ required to arrive at (2.13) and (2.14). Such estimates can be called 'strong' in the sense of (1.7).

We may therefore envisage the optimization of (2.10) and (2.12) in such special cases, using ideas associated with those described for (2.13) and (2.14). In the linear problem (1.2), we can see from (1.6) that


L 


0 


(2.70) 


and  (2.10)  reduces  to 

\ <Vu+>  - -(x0,T'u+  + j f'  ( Z-71) 

for any $u_+$ in the domain of $T^*$. The appropriate optimization no longer involves an expansion about either the α-point (as in (2.32)) or the β-point (as in (2.50)), but about another point in the family (2.20) midway between them, i.e.

$$u_+ = \tfrac12 (u_\alpha + u_\beta) + kq . \qquad (2.73)$$

Fujita's strong estimate (1.7) can now be recovered by optimizing (2.71) with respect to the $k$ of (2.73).

Nothing  new  is  achieved  from  (2.12),  since  this  reduces  to 

2 <u-’u.^  - "(X0>T  u_  + g f)  (2.74) 

for any $u_-$ in the domain of $T^*$, which is therefore exactly the same as (2.71).

Instead of offering more general theory, we give the explicit example described in §§ 6(ii)-(v), in which $L_0$ has the linear form (6.22).

3. Basic Theoretical Framework

(i) Mean value theorems

Underlying all of our general theory is a result whose essence was stated in equation (5.3) of our earlier joint paper (op. cit.). To begin with let $L[x,u]$ be a function of two real variables, defined over a rectangular domain which allows us to join an arbitrary pair of points $x_+, u_+$ and $x_-, u_-$ by a two-segment path parallel to the axes and lying entirely within the domain, as in Fig. 3.1.


Fig. 3.1. Mean value theorem route

Then two uses of the second mean value theorem for one variable, namely $u$ on the single-barred segment and $x$ on the double-barred segment, give

$$L_+ - L_- - (x_+ - x_-)\frac{\partial L}{\partial x_+} - (u_+ - u_-)\frac{\partial L}{\partial u_-} = \tfrac12 (u_+ - u_-)^2\, \overline{\frac{\partial^2 L}{\partial u^2}} - \tfrac12 (x_+ - x_-)^2\, \overline{\overline{\frac{\partial^2 L}{\partial x^2}}} . \qquad (3.1)$$

The bar and double-bar over the second derivatives denote evaluation at different and unknown intermediate points on the single- and double-barred segments respectively.

In our function space setting we can generalize (3.1) by using the abstract form of Taylor series with integral remainder given by, for instance, Cartan (1971), p. 70, and Rall (1969), p. 124:

$$f(a + h) = f(a) + f'(a)\cdot h + \int_0^1 (1 - t)\, f''(a + th)\cdot (h)^2\, dt .$$

This leads to the following generalization of (3.1) for the case of the two inner product spaces introduced after (1.2):

$$L_+ - L_- - \left(x_+ - x_-, \frac{\partial L}{\partial x_+}\right) - \left\langle u_+ - u_-, \frac{\partial L}{\partial u_-}\right\rangle = \tfrac12\left\langle u_+ - u_-,\, \overline{\frac{\partial^2 L}{\partial u^2}}(u_+ - u_-)\right\rangle - \tfrac12\left(x_+ - x_-,\, \overline{\overline{\frac{\partial^2 L}{\partial x^2}}}(x_+ - x_-)\right) , \qquad (3.2)$$

where

$$\overline{\frac{\partial^2 L}{\partial u^2}} = 2\int_0^1 (1 - t)\, \frac{\partial^2 L}{\partial u^2}\bigl(x_-,\, u_- + t(u_+ - u_-)\bigr)\, dt ,$$

$$\overline{\overline{\frac{\partial^2 L}{\partial x^2}}} = 2\int_0^1 (1 - t)\, \frac{\partial^2 L}{\partial x^2}\bigl(x_+ + t(x_- - x_+),\, u_+\bigr)\, dt .$$

In specific applications these can be replaced by the appropriate mean-value theorem. The expression on the left in (3.2) corresponds to the saddle quantity appearing in (2.1), but is not now assumed necessarily to be one-signed. The symbols reminiscent of second derivatives on the right have now to be regarded as linear operators acting on the elements of F or E which follow them, as for $g'(u)$ in §§ 2(vii) and (viii). Further details on higher derivatives in vector spaces are described by Rall (1969, §§ 18, 19).
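The scalar identity (3.1), with the barred derivatives written as the integral means that accompany (3.2), can be checked numerically. The sketch below is an illustration only: the test function `L`, the sample points and all helper names are hypothetical choices made here, and composite Simpson quadrature stands in for the exact integrals.

```python
def simpson(f, n=200):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


# Hypothetical smooth function of two real variables and its derivatives.
def L(x, u):
    return u**3 * x - x**4 + x * u

def L_x(x, u):
    return u**3 - 4.0 * x**3 + u

def L_u(x, u):
    return 3.0 * u**2 * x + x

def L_xx(x, u):
    return -12.0 * x**2

def L_uu(x, u):
    return 6.0 * u * x


def saddle_quantity(xp, up, xm, um):
    """Left side of (3.1): gradients at x+ (for x) and at the minus point (for u)."""
    return (L(xp, up) - L(xm, um)
            - (xp - xm) * L_x(xp, up) - (up - um) * L_u(xm, um))


def right_side(xp, up, xm, um):
    """Right side of (3.1) with the means taken as integrals along the two segments."""
    bar_uu = 2.0 * simpson(lambda t: (1.0 - t) * L_uu(xm, um + t * (up - um)))
    dbar_xx = 2.0 * simpson(lambda t: (1.0 - t) * L_xx(xp + t * (xm - xp), up))
    return 0.5 * (up - um) ** 2 * bar_uu - 0.5 * (xp - xm) ** 2 * dbar_xx


xp, up, xm, um = 1.3, -0.4, 0.2, 0.9   # arbitrary plus and minus points
lhs = saddle_quantity(xp, up, xm, um)
rhs = right_side(xp, up, xm, um)
print(lhs, rhs)
```

For this polynomial test function the integrands are at most cubic in $t$, so Simpson's rule is exact up to rounding and the two sides agree to machine precision.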

Note that a bilinear functional, like $(x, T^*u)$ or the Lagrangian which generates linear programming (Noble and Sewell, op. cit., § 10(ii)), has only mixed second derivatives, and so contributes identically zero to the saddle quantity. This is evident from the right side of (3.2), and can be verified on the left side by direct substitution.

For certain purposes, ultimately connected with embedded problems, we shall also draw conclusions from mean value statements for gradients of the type

$$\frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} = \frac{\partial^2 L}{\partial x^2}(x_+ - x_-) + \frac{\partial^2 L}{\partial x\,\partial u}(u_+ - u_-) , \qquad (3.3)$$

$$\frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} = \frac{\partial^2 L}{\partial u\,\partial x}(x_+ - x_-) + \frac{\partial^2 L}{\partial u^2}(u_+ - u_-) . \qquad (3.4)$$

Precise statements about mean value theorems for operators are given by Rall (op. cit., § 20). Examples of second derivative operators which happen also to be constant are found in the problems generated by (1.14) and (1.17), where


A , 
= -pI’ 


A = * 

9x9u  ’ 


A 

9u2 


9ZL 

9u9x 


I 


(3.  5) 


(ii) Boundedness hypotheses

We have now removed the saddle hypothesis (2.1), and in its place we begin to build our theoretical framework upon the following boundedness hypotheses. We suppose that there exists a rectangular domain of the product space E × F in which real numbers $k_x$, $k_u$, or $K_x$, $K_u$, or both pairs, can be found such that

(a) for each given $u_+$ and every pair $x_+, x_-$

$$K_x \|x_+ - x_-\|^2 \ge -\left(x_+ - x_-,\, \overline{\overline{\frac{\partial^2 L}{\partial x^2}}}(x_+ - x_-)\right) \ge k_x \|x_+ - x_-\|^2 , \qquad (3.6)$$

(b) for each given $x_-$ and every pair $u_+, u_-$

$$K_u \|u_+ - u_-\|^2 \ge \left\langle u_+ - u_-,\, \overline{\frac{\partial^2 L}{\partial u^2}}(u_+ - u_-)\right\rangle \ge k_u \|u_+ - u_-\|^2 . \qquad (3.7)$$

In general it may be that $k_x$ and $K_x$ depend on $u_+$, and that $k_u$ and $K_u$ depend on $x_-$, and we sometimes emphasize this by writing

$$k_x(u_+), \quad K_x(u_+), \quad k_u(x_-), \quad K_u(x_-) . \qquad (3.8)$$

Of course they may sometimes be constant over the domain. We allow that these bounds (3.8) may have either sign. For quadratic functionals such as (1.17) there exist the trivial bounds


K - k : P, 

x x ’ 


K = k = 1 . 
u u 


(3.9) 


Evidently (3.2) with (3.6) and (3.7) may be written

$$B \ge L_+ - L_- - \left(x_+ - x_-, \frac{\partial L}{\partial x_+}\right) - \left\langle u_+ - u_-, \frac{\partial L}{\partial u_-}\right\rangle \ge b \qquad (3.10)$$

with the shorthand

$$B = \tfrac12 K_x(u_+) \|x_+ - x_-\|^2 + \tfrac12 K_u(x_-) \|u_+ - u_-\|^2 ,$$
$$b = \tfrac12 k_x(u_+) \|x_+ - x_-\|^2 + \tfrac12 k_u(x_-) \|u_+ - u_-\|^2 . \qquad (3.11)$$

Sufficient conditions for $L[x,u]$ to be a saddle functional concave in $x$ and convex in $u$ are that

$$k_x(u_+) \ge 0 \quad\text{and}\quad k_u(x_-) \ge 0 \qquad (3.12)$$

over a rectangular domain, so that $b \ge 0$. Then (2.1) follows from (3.10) without need of $K_x$ and $K_u$. [Alternatively $L$ is convex in $x$ and concave in $u$ if $0 \ge K_x$ and $0 \ge K_u$, without need of $k_x$ and $k_u$. But this can be reduced to the first case by turning the saddle functional upside down.] Sufficient conditions for a strict saddle are strict inequalities in (3.12), and then any solution $x_0, u_0$ of (1.13) is unique (Sewell, 1969, inequality (2.52)).

The  type  of  theory  presented  in  § 2 is  available  when  (3.12)  hold 
with  at  least  one  of  the  inequalities  strict. 

However,  there  are  many  situations  when  such  sufficient  hypotheses 
fail,  such  as  nonlinear  elasticity  or  nonconvex  optimization,  and  in  which 
there  may  still  be  a need  to  bound  linear  functionals.  There  will  now  be 
a different  linear  functional  to  bound  for  each  different  solution,  and  the 
needed  alternative  hypotheses  will  reflect  the  attention  which  must  be 
given  to  domain  boundaries  separating  the  individual  solutions. 

Consider the example of a function of two real variables

$$L[x,u] = -\left(\tfrac14 x^4 + \tfrac12 a x^2 + bx + c\right)u + px \qquad (3.13)$$

with scalar coefficients $a, b, c, p$. For this $K_u = k_u = 0$ and (3.10) reduces to

$$K_x \ge (3\bar{x}^2 + a)u_+ \ge k_x \qquad (3.14)$$

for some mean $\bar{x}$ between $x_+$ and $x_-$. Evidently in the half-space $u_+ > 0$

$$k_x(u_+) = a u_+ , \qquad K_x \text{ does not exist} , \qquad (3.15)$$

whereas in $u_+ < 0$ $k_x$ does not exist but $K_x = a u_+$. Therefore a sufficient condition for (3.13) to be a saddle function strictly concave in $x$ in $u_+ > 0$ is

$$a > 0 . \qquad (3.16)$$

This illustrates (2.25).

When $a < 0$, however, the equation $\partial L/\partial u = 0$ is a quartic in $x$ which might not have a unique solution. Its solutions will be separated by the turning points of the quartic, which are specified by the roots of

$$\frac{\partial^2 L}{\partial u\,\partial x} = 0 , \quad\text{i.e.}\quad x^3 + ax + b = 0 . \qquad (3.17)$$

The required new hypotheses would avoid such roots. It is no accident that (3.17) is also the equilibrium surface for the cusp catastrophe (e.g. see Sewell, 1976, for diagrams and mechanical examples), and further insight can be obtained by pursuing this connection.
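The behaviour described for the example (3.13) can be explored numerically. The following sketch is illustrative only: the sample coefficients, the sample grid and the function names are hypothetical. It evaluates the saddle quantity in the centre of (3.10) on a grid with $u_+ > 0$, and counts the turning points of the quartic from the discriminant $-4a^3 - 27b^2$ of the cubic $x^3 + ax + b$ (a standard fact about depressed cubics, used here as an outside ingredient).

```python
def N(x, a, b, c):
    """Quartic appearing in (3.13)."""
    return 0.25 * x**4 + 0.5 * a * x**2 + b * x + c


def dN(x, a, b):
    """Its derivative, the cubic x^3 + a x + b of (3.17)."""
    return x**3 + a * x + b


def L(x, u, a, b, c, p):
    """Example (3.13): L[x,u] = -(x^4/4 + a x^2/2 + b x + c) u + p x."""
    return -N(x, a, b, c) * u + p * x


def saddle_quantity(xp, up, xm, um, a, b, c, p):
    """L+ - L- - (x+ - x-) dL/dx+ - (u+ - u-) dL/du-, the centre of (3.10)."""
    dL_dx_plus = -dN(xp, a, b) * up + p
    dL_du_minus = -N(xm, a, b, c)
    return (L(xp, up, a, b, c, p) - L(xm, um, a, b, c, p)
            - (xp - xm) * dL_dx_plus - (up - um) * dL_du_minus)


def min_saddle_quantity(a, b, c, p):
    """Smallest saddle quantity over a small sample grid with u+ > 0."""
    xs = [-2.0 + 0.5 * i for i in range(9)]
    values = [saddle_quantity(xp, up, xm, um, a, b, c, p)
              for xp in xs for xm in xs
              for up in (0.5, 2.0) for um in (-1.0, 1.0)]
    return min(values)


def count_turning_points(a, b):
    """Number of distinct real roots of x^3 + a x + b = 0."""
    disc = -4.0 * a**3 - 27.0 * b**2
    if disc > 0:
        return 3
    if disc < 0:
        return 1
    return 1 if (a == 0 and b == 0) else 2


print(min_saddle_quantity(1.0, 0.5, -1.0, 1.0))    # a > 0: no negative values
print(min_saddle_quantity(-3.0, 0.5, -1.0, 1.0))   # a < 0: saddle property lost
print(count_turning_points(1.0, 0.5), count_turning_points(-3.0, 0.5))
```

With $a > 0$ the sampled saddle quantity never goes negative and the quartic has a single turning point; with $a < 0$ and small $b$ negative values appear and three turning points separate up to four candidate domains, which is the situation the alternative hypotheses are meant to handle.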

At this point, however, we have said enough to motivate the following choice of alternative boundedness hypotheses required when the sufficient saddle hypotheses (3.12) fail. We assume that in a rectangular domain there exist real numbers

$$c_x(u_+) > 0 \quad\text{and/or}\quad c_u(x_-) > 0 \qquad (3.18)$$

depending possibly on $u_+$ and $x_-$ as indicated, such that for every pair of points

$$\left\| \frac{\partial^2 L}{\partial u\,\partial x}(x_+ - x_-) \right\| \ge c_x \|x_+ - x_-\| , \qquad (3.19)$$

$$\left\| \frac{\partial^2 L}{\partial x\,\partial u}(u_+ - u_-) \right\| \ge c_u \|u_+ - u_-\| . \qquad (3.20)$$

If it is also the case that there exist real numbers

$$d_x(u_+) \ge 0 \quad\text{and/or}\quad d_u(x_-) \ge 0 , \qquad (3.21)$$

again depending possibly on $u_+$ and $x_-$, such that for every pair of points

$$d_x \|x_+ - x_-\| \ge \left\| \frac{\partial^2 L}{\partial x^2}(x_+ - x_-) \right\| , \qquad (3.22)$$

$$d_u \|u_+ - u_-\| \ge \left\| \frac{\partial^2 L}{\partial u^2}(u_+ - u_-) \right\| , \qquad (3.23)$$

then Schwarz's inequality allows the values

$$-k_x = K_x = d_x \ge 0 , \qquad (3.24)$$

$$-k_u = K_u = d_u \ge 0 \qquad (3.25)$$

for the numbers (3.8). Therefore (3.22) and (3.23) can be used when the sufficient saddle requirement (3.12) fails.

(iii)  Error  estimates 

We draw some conclusions from (3.10), first of all without any assumption about the signs of $k_x$, $k_u$, $K_x$, $K_u$.

(a) Choosing $x_+, u_+ = x_\alpha, u_\alpha$ and $x_-, u_- = x_0, u_0$ (as in (2.3) but without the saddle hypothesis) implies

$$B_{\alpha 0} \ge L_\alpha - L_0 \ge b_{\alpha 0} . \qquad (3.26)$$

(b) Choosing $x_+, u_+ = x_0, u_0$ and $x_-, u_- = x_\beta, u_\beta$ (as in (2.4) but without the saddle hypothesis) implies

$$B_{0\beta} \ge L_0 - L_\beta \ge b_{0\beta} . \qquad (3.27)$$

(c) Choosing $x_+, u_+ = x_\alpha, u_\alpha$ and $x_-, u_- = x_\beta, u_\beta$ implies

$$B_{\alpha\beta} \ge L_\alpha - L_\beta \ge b_{\alpha\beta} . \qquad (3.28)$$

Then $(3.26)_1 + (3.27)_1$ with $(3.28)_2$ implies

$$B_{\alpha 0} + B_{0\beta} \ge L_\alpha - L_\beta \ge b_{\alpha\beta} \qquad (3.29)$$

and $(3.26)_2 + (3.27)_2$ with $(3.28)_1$ implies

$$B_{\alpha\beta} \ge L_\alpha - L_\beta \ge b_{\alpha 0} + b_{0\beta} . \qquad (3.30)$$

These last two inequalities, with (3.11), can be regarded as composite error estimates for the solution quantities in the left of (3.29) or the right of (3.30). For example, the latter written explicitly is

$$L_\alpha - L_\beta \ge \tfrac12 k_x(u_\alpha) \|x_\alpha - x_0\|^2 + \tfrac12 k_x(u_0) \|x_0 - x_\beta\|^2 + \tfrac12 k_u(x_0) \|u_\alpha - u_0\|^2 + \tfrac12 k_u(x_\beta) \|u_\beta - u_0\|^2 . \qquad (3.31)$$

In the case of a saddle functional satisfying (3.12) with

$$k_x(u_\alpha) > 0 \quad\text{and/or}\quad k_u(x_\beta) > 0 \qquad (3.32)$$

more can be given away from (3.31) to imply

$$\frac{2}{k_x(u_\alpha)}(L_\alpha - L_\beta) \ge \|x_\alpha - x_0\|^2 \qquad (3.33)$$

and/or

$$\frac{2}{k_u(x_\beta)}(L_\alpha - L_\beta) \ge \|u_\beta - u_0\|^2 . \qquad (3.34)$$

These simple error estimates are the appropriate generalizations of (2.8) and they improve results given by Zago (1976, Chapter 2).

Next suppose that, instead of the sufficient requirement (3.12) for a saddle functional, both (3.19)-(3.20) and (3.22)-(3.23) hold. Then the triangle inequality applied to the mean value statements (3.3) and (3.4) leads to

$$-d_x(u_+) \|x_+ - x_-\| + c_u(x_-) \|u_+ - u_-\| \le \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| , \qquad (3.35)$$

$$c_x(u_+) \|x_+ - x_-\| - d_u(x_-) \|u_+ - u_-\| \le \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| . \qquad (3.36)$$

Because the basic problem (1.13) is stated in terms of gradients (satisfying also (2.2)), the right sides of (3.35) and (3.36) can be regarded as known, in particular under choices of the disposable plus and minus points like those in (a)-(c) above. Therefore these inequalities are the basis for another class of error estimates different from (3.33) and (3.34). Their usefulness may depend somewhat on the extent to which $c_x$, $c_u$, $d_x$, $d_u$ are actually constant over the considered rectangular domain.

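In any event, if it is also true that

$$c_x c_u - d_x d_u > 0 , \qquad (3.37)$$

(3.35) and (3.36) can be solved to give

$$\|x_+ - x_-\| \le \frac{1}{c_x c_u - d_x d_u}\left[ d_u \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| + c_u \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| \right] , \qquad (3.38)$$

$$\|u_+ - u_-\| \le \frac{1}{c_x c_u - d_x d_u}\left[ c_x \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| + d_x \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| \right] . \qquad (3.39)$$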


These can be substituted back, with (3.24) and (3.25), into (3.20) to give an upper bound for $B$, and a lower bound for $b$. They can also be substituted, after Schwarz's inequality, into either or both of the inner products appearing in the centre expression of (3.10), finally giving bounds for what is left there.
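The elimination that solves the pair (3.35), (3.36) for the two norms can be written out explicitly. The block below is a worked restatement of that algebra only; the abbreviations $P$ and $Q$ for the two gradient-difference norms are introduced here for brevity and are not part of the original notation.

```latex
\begin{aligned}
&P = \Bigl\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \Bigr\| , \qquad
 Q = \Bigl\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \Bigr\| , \\[4pt]
&(3.35):\quad -d_x \|x_+ - x_-\| + c_u \|u_+ - u_-\| \le P , \qquad
 (3.36):\quad c_x \|x_+ - x_-\| - d_u \|u_+ - u_-\| \le Q , \\[4pt]
&d_u \times (3.35) + c_u \times (3.36):\quad
 (c_x c_u - d_x d_u)\, \|x_+ - x_-\| \le d_u P + c_u Q , \\[4pt]
&c_x \times (3.35) + d_x \times (3.36):\quad
 (c_x c_u - d_x d_u)\, \|u_+ - u_-\| \le c_x P + d_x Q .
\end{aligned}
```

The multipliers $c_x$, $c_u$, $d_x$, $d_u$ are non-negative, so the inequalities keep their direction, and division by $c_x c_u - d_x d_u$ is legitimate exactly when (3.37) holds.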

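For example, we can add

$$\|x_+ - x_-\| \left\| \frac{\partial L}{\partial x_+} \right\| \ge \left(x_+ - x_-, \frac{\partial L}{\partial x_+}\right) \ge -\|x_+ - x_-\| \left\| \frac{\partial L}{\partial x_+} \right\| \qquad (3.40)$$

to (3.10), and then substitute (3.38) and (3.39) into both of the resulting bounds, giving

$$\begin{aligned}
\left| L_+ - L_- - \left\langle u_+ - u_-, \frac{\partial L}{\partial u_-}\right\rangle \right|
\le{}& \frac{1}{c_x c_u - d_x d_u} \left\| \frac{\partial L}{\partial x_+} \right\| \left[ d_u \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| + c_u \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| \right] \\
&+ \frac{d_x}{2 (c_x c_u - d_x d_u)^2} \left[ d_u \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| + c_u \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| \right]^2 \\
&+ \frac{d_u}{2 (c_x c_u - d_x d_u)^2} \left[ c_x \left\| \frac{\partial L}{\partial x_+} - \frac{\partial L}{\partial x_-} \right\| + d_x \left\| \frac{\partial L}{\partial u_+} - \frac{\partial L}{\partial u_-} \right\| \right]^2 . \qquad (3.41)
\end{aligned}$$

In the next Section we give an example of this result.

4. General Embedding Method

(i) Embedding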

We consider now a problem given ab initio in the form (1.19). We can allow that it be decomposable or nondecomposable, linear or nonlinear. Identify the variable in the problem with the $x$ of an inner product space E, and suppose the operator $N$ ranges in a second inner product space F. The typical element $u$ of F is employed in the role of a Lagrangian multiplier by constructing the functional

$$L[x,u] = -\langle u, N(x)\rangle + (p, x) \qquad (4.1)$$

where $p$ is an assignable vector in E.

Suppose the adjoint $N'^*$ of the Gateaux differential $N'$ exists (Barnsley and Robinson (1976) give technical details in the case of two Hilbert spaces). Then the gradients of $L$ are

$$\frac{\partial L}{\partial x} = -N'^*(x)u + p , \qquad \frac{\partial L}{\partial u} = -N(x) . \qquad (4.2)$$

Then (1.13) in the form

$$N'^*(x)u - p = 0 \quad (\alpha), \qquad N(x) = 0 \quad (\beta) \qquad (4.3)$$

contains (1.19) embedded as (4.3β), with (4.3α) as an auxiliary equation.

The real objective now is to estimate the linear functional $(p, x_\beta)$, since $x_\beta$ is a solution of the ab initio problem, and $p$ is an assignable vector.

The significance of the result (3.41) for this purpose is that because $u$ appears linearly in (4.1) as a Lagrangian multiplier, the quantity estimated on the left of (3.41) is

$$L_+ - L_- - \left\langle u_+ - u_-, \frac{\partial L}{\partial u_-}\right\rangle = -\langle u_+, N(x_+) - N(x_-)\rangle + (p, x_+ - x_-) . \qquad (4.4)$$

Therefore the choice

$$x_+, u_+ \text{ arbitrary}, \qquad x_- = x_\beta \qquad (4.5)$$

introduces the objective functional $(p, x_\beta)$ directly into (4.4), which becomes

$$-\langle u_+, N(x_+)\rangle + (p, x_+) - (p, x_\beta) \qquad (4.6)$$

since $N(x_\beta) = 0$. The first two terms in (4.6) are assignable, so (3.41) gives an estimate for $(p, x_\beta)$ provided the hypotheses leading to (3.41) can be verified.

The linearity of (4.1) in $u$ allows

$$d_u = 0 \qquad (4.7)$$

in (3.23) and (3.25). With

$$d_x = \|u_+\|\, d , \qquad d \ge 0 \qquad (4.8)$$

in (3.24), the constant $d$ corresponds via (3.22) to a bound imposed by Barnsley and Robinson (op. cit.) on the second derivative of the operator $N(x)$. Suppose there exists

$$c_x(u_+) > 0 \qquad (4.9)$$

such that, for all $x_+, u_+$,

$$\|N(x_+)\| = \|N(x_+) - N(x_\beta)\| \ge c_x \|x_+ - x_\beta\| \qquad (4.10)$$

so that $c_x$ in (3.19) is effectively a bound on the first derivative of $N(x)$. Because of (4.7) neither $c_u$ nor (3.20) are required, and with (4.5) the right side of (3.41) reduces to

$$\frac{1}{c_x} \left\| \frac{\partial L}{\partial x_+} \right\| \left\| \frac{\partial L}{\partial u_+} \right\| + \frac{d_x}{2 c_x^2} \left\| \frac{\partial L}{\partial u_+} \right\|^2 . \qquad (4.11)$$

Substituting (4.2) and recalling (4.6) leads finally to the explicit estimate

$$\left| -\langle u_+, N(x_+)\rangle + (p, x_+) - (p, x_\beta) \right| \le \frac{1}{c_x} \|N'^*(x_+)u_+ - p\| \, \|N(x_+)\| + \frac{d}{2 c_x^2} \|u_+\| \, \|N(x_+)\|^2 . \qquad (4.12)$$

This corresponds to the result of Barnsley and Robinson (op. cit., inequalities (3.6)), but is obtained here from a different viewpoint.

(ii) Example

We can indicate very briefly how the procedure works by referring to the algebraic example (3.13). Then (4.3) becomes

$$(x^3 + ax + b)u = p \quad (\alpha), \qquad \tfrac14 x^4 + \tfrac12 a x^2 + bx + c = 0 \quad (\beta) . \qquad (4.13)$$

To satisfy (4.10) we have to distinguish not more than four domains of the x-axis, separated by the stationary points of the quartic. In any such fixed open domain the slope (and therefore the cubic coefficient of $u$ in (4.13α)) is nonzero, and a bound $c_x$ for it can be determined. Recalling (3.14), the right side of (3.22) is

$$|(3\bar{x}^2 + a)u_+ (x_+ - x_-)| \le |x_+ - x_-| \cdot |u_+| \cdot \max |3x^2 + a| , \qquad (4.14)$$

and therefore in (4.8) we choose $d = \max |3x^2 + a|$ over the domain. If there is a solution $x_\beta$ of the quartic in that domain, a bound for $p x_\beta$ can be obtained from (4.12) with any $u_+$ and any $x_+$. The first term on the right of (4.12) can be made to vanish if we choose the arbitrary

$$x_+, u_+ = x_\alpha, u_\alpha , \quad\text{i.e.}\quad (x_\alpha^3 + a x_\alpha + b)u_\alpha = p \qquad (4.15)$$


but  this  is  not  essential.  Improvement  of  the  bounds  is  another  matter, 
however,  and  Barnsley  and  Robinson  (op.  cit. ) mention  the  connection 
with  Newton's  method.  They  discuss  a particular  case  of  this  example 
in  which  a - 0,  b = c - - ~ , for  which  the  quartic  is  actually  convex. 
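The estimate (4.12) can be tried out on the quartic of (4.13). The sketch below is an illustration under assumptions made here and not taken from the source: the coefficients $a = 1$, $b = 1$, $c = -1$, $p = 1$ and the domain $0 \le x \le 1$ are hypothetical choices (they are not the Barnsley and Robinson case), for which $|N'(x)| \ge 1$ gives $c_x = 1$ and $\max|3x^2 + a| = 4$ gives $d = 4$. A bisection root is used only to have a reference value of $p x_\beta$ for comparison.

```python
def N(x):
    """Quartic of (4.13) with the hypothetical coefficients a = 1, b = 1, c = -1."""
    return 0.25 * x**4 + 0.5 * x**2 + x - 1.0


def dN(x):
    """Derivative x^3 + a x + b of the quartic."""
    return x**3 + x + 1.0


P = 1.0     # assignable coefficient p (hypothetical value)
C_X = 1.0   # lower bound for |N'(x)| on [0, 1], playing the role of c_x in (4.10)
D = 4.0     # max |3 x^2 + a| on [0, 1], playing the role of d in (4.8)


def estimate(x_plus, u_plus):
    """Return (centre, radius) with |centre - p * x_beta| <= radius, following (4.12)."""
    centre = -u_plus * N(x_plus) + P * x_plus
    radius = (abs(dN(x_plus) * u_plus - P) * abs(N(x_plus)) / C_X
              + D * abs(u_plus) * N(x_plus) ** 2 / (2.0 * C_X ** 2))
    return centre, radius


def bisect_root(f, lo, hi, iterations=200):
    """Plain bisection, used only to obtain a reference root for comparison."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


x_beta = bisect_root(N, 0.0, 1.0)
centre, radius = estimate(0.5, 1.0)                      # arbitrary x+, u+
centre_a, radius_a = estimate(0.5, P / dN(0.5))          # choice (4.15): first term vanishes
print(P * x_beta, centre, radius, centre_a, radius_a)
```

With the choice (4.15) the centre becomes $p\,(x_\alpha - N(x_\alpha)/N'(x_\alpha))$, which is $p$ times one Newton step from $x_\alpha$, and the radius is then of second order in $N(x_\alpha)$; this is one way to read the remark about Newton's method.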


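5. Nonlinear Programming Method

(i) Governing conditions

Suppose that the basic problem (1.13) is now replaced by a new problem governed by the following different conditions, but again generated from a given scalar functional $L[x,u]$ of the elements of two inner product spaces E and F.

$$\frac{\partial L}{\partial x} \le 0 \quad (\alpha), \qquad x \ge 0 \quad (\beta), \qquad \left(x, \frac{\partial L}{\partial x}\right) = 0 , \qquad (5.1)$$

$$\frac{\partial L}{\partial u} \ge 0 \quad (\beta), \qquad u \ge 0 \quad (\alpha), \qquad \left\langle u, \frac{\partial L}{\partial u}\right\rangle = 0 . \qquad (5.2)$$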


The presence of inequalities of course implies that the elements of E and F are built up ultimately from real numbers (e.g. via the individual entries in real matrices), to which the inequalities are applied. In other words, all elements are ordered so that the inequalities are defined. These governing conditions have again been divided into two subsets labelled (α) and (β) (and a third unlabelled subset, of 'orthogonality conditions'). A point $x_\alpha, u_\alpha$ now denotes any solution of (5.1α) + (5.2α), and a point $x_\beta, u_\beta$ is any solution of (5.1β) + (5.2β).

When $L[x,u]$ is a saddle functional in the sense of (2.1), the choice (2.3) implies the minimum principle

$$L_\alpha - \left(x_\alpha, \frac{\partial L}{\partial x_\alpha}\right) - L_0 \ge -\left(x_0, \frac{\partial L}{\partial x_\alpha}\right) + \left\langle u_\alpha, \frac{\partial L}{\partial u_0}\right\rangle \ge 0 . \qquad (5.3)$$

On the other hand, the choice (2.4) in (2.1) implies the maximum principle

$$L_0 - L_\beta + \left\langle u_\beta, \frac{\partial L}{\partial u_\beta}\right\rangle \ge -\left(x_\beta, \frac{\partial L}{\partial x_0}\right) + \left\langle u_0, \frac{\partial L}{\partial u_\beta}\right\rangle \ge 0 . \qquad (5.4)$$

Therefore, in place of (2.5) we have the following dual extremum principles

$$L_\alpha - \left(x_\alpha, \frac{\partial L}{\partial x_\alpha}\right) \ge L_0 \ge L_\beta - \left\langle u_\beta, \frac{\partial L}{\partial u_\beta}\right\rangle \qquad (5.5)$$

proved in Sewell (1973a, § lie). The extrema are not in general stationary. A suffix zero refers to a solution value for the whole problem (5.1) + (5.2).

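(iii) General bounds for linear functionals

In place of (2.9), choose

$$x_+ \text{ arbitrary}, \quad \text{any } u_+ \ge 0 , \quad\text{and}\quad x_-, u_- = x_0, u_0 . \qquad (5.6)$$

The consequent (2.1), when added to (5.4) to eliminate $L_0$, is

$$L_+ - \left(x_+, \frac{\partial L}{\partial x_+}\right) - L_\beta + \left\langle u_\beta, \frac{\partial L}{\partial u_\beta}\right\rangle \ge -\left(x_0, \frac{\partial L}{\partial x_+}\right) + \left\langle u_0, \frac{\partial L}{\partial u_\beta}\right\rangle . \qquad (5.7)$$

The left side is a supposedly known estimate for either of the two linear functionals of $x_0$ and $u_0$ on the right.

Instead of (5.6), modify (2.11) to choose

$$x_+, u_+ = x_0, u_0 \quad\text{and any } x_- \ge 0 , \quad \text{arbitrary } u_- . \qquad (5.7)$$

The consequent (2.1), when added to (5.3) to remove $L_0$, is

$$L_\alpha - \left(x_\alpha, \frac{\partial L}{\partial x_\alpha}\right) - L_- + \left\langle u_-, \frac{\partial L}{\partial u_-}\right\rangle \ge -\left(x_0, \frac{\partial L}{\partial x_\alpha}\right) + \left\langle u_0, \frac{\partial L}{\partial u_-}\right\rangle . \qquad (5.8)$$

Again the left side is a supposedly known estimate for either of the two linear functionals of $x_0$ and $u_0$ on the right.

The general bounds (5.7) and (5.8) are extensions of (2.13) and (2.14). Optimization of them is unexplored, but their brevity warrants their inclusion here, for completeness.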

(iv) Embedding method

When an ab initio problem (1.19) contains an operator $N(x)$ which happens to be convex in some domain, then we can construct a functional (4.1) for which the left side of (2.1) is

$$\langle u_+,\; N(x_-) - N(x_+) - N'(x_+)(x_- - x_+) \rangle \qquad (5.9)$$

which will be nonnegative in the half-space $u_+ > 0$, as in the case of (2.25). Then (4.1) is a saddle functional. This suggests seeking to embed the problem in a variant of (5.1) and (5.2), namely

$$N'^*(x)u - p = 0 , \quad (\alpha) \qquad (5.10)$$

$$-N(x) \ge 0 \quad (\beta), \qquad u \ge 0 \quad (\alpha), \qquad -\langle u, N(x)\rangle = 0 . \qquad (5.11)$$

Thus we take (1.13α) with (5.2). The orthogonality condition is taken to imply that $N(x) = 0$ whenever the strict inequality $u > 0$ holds, and in that sense the embedding is achieved.

The objective now is therefore to bound $(p, x_0)$ corresponding to $u_0 > 0$ in the actual solution of (5.10) with (5.11) (and not to bound $(p, x_\beta)$ as in § 4, because (5.11β) is not itself the ab initio problem). The dual extremum principles (5.5) still apply, and for (4.1) become

$$-\langle u_\alpha, N(x_\alpha)\rangle + (p, x_\alpha) \ge (p, x_0) \ge (p, x_\beta) . \qquad (5.12)$$

These are themselves the required bounds. The bound on the right is not necessarily stationary because possibly first order terms have been given away in its derivation, but it may be easy to find.
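The two-sided bound (5.12) can be illustrated on a convex quartic of the kind used in the algebraic example (3.13). This is a sketch under assumptions introduced here: the coefficients $a = 1$, $b = 1$, $c = -1$ and $p = 1$ are hypothetical (with $a > 0$ the quartic is convex on the whole axis), and bisection supplies a reference value of $x_0$ purely for comparison.

```python
def N(x):
    """Convex quartic x^4/4 + a x^2/2 + b x + c with hypothetical a = 1, b = 1, c = -1."""
    return 0.25 * x**4 + 0.5 * x**2 + x - 1.0


def dN(x):
    """Derivative x^3 + a x + b."""
    return x**3 + x + 1.0


P = 1.0  # assignable coefficient p (hypothetical value)


def upper_bound(x_alpha):
    """Left member of (5.12), with u_alpha fixed by (x^3 + a x + b) u = p and u_alpha >= 0."""
    slope = dN(x_alpha)
    if slope == 0.0:
        raise ValueError("the cubic coefficient must not vanish")
    u_alpha = P / slope
    if u_alpha < 0.0:
        raise ValueError("this x_alpha gives u_alpha < 0")
    return -u_alpha * N(x_alpha) + P * x_alpha


def lower_bound(x_beta):
    """Right member of (5.12): p * x_beta for any x_beta with N(x_beta) <= 0."""
    if N(x_beta) > 0.0:
        raise ValueError("x_beta must satisfy N(x_beta) <= 0")
    return P * x_beta


def bisect_root(f, lo, hi, iterations=200):
    """Plain bisection, used only to obtain a reference root for comparison."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


x_0 = bisect_root(N, 0.0, 1.0)   # root with N'(x_0) > 0, so that u_0 = p / N'(x_0) > 0
upper = upper_bound(0.5)
lower = lower_bound(0.69)
print(lower, P * x_0, upper)
```

Since $u_\alpha = p/N'(x_\alpha)$, the upper bound equals $p\,(x_\alpha - N(x_\alpha)/N'(x_\alpha))$, so it can be tightened by feeding the result back in as a new $x_\alpha$, while the lower bound improves as $x_\beta$ is moved towards the edge of the set where the quartic is non-positive.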

In the algebraic example (3.13), the ab initio problem was the quartic

$$N(x) = \tfrac14 x^4 + \tfrac12 a x^2 + bx + c = 0 . \qquad (4.13\beta)$$

There are either one or two domains in which it is convex, and at most one domain for which it is concave (for which case the function can first be turned upside down before applying the procedure). In a convex domain the bounds (5.12) read

$$-u_\alpha\left(\tfrac14 x_\alpha^4 + \tfrac12 a x_\alpha^2 + b x_\alpha + c\right) + p x_\alpha \ge p x_0 \ge p x_\beta \qquad (5.13)$$

where $x_\beta$ is any solution of

$$\tfrac14 x_\beta^4 + \tfrac12 a x_\beta^2 + b x_\beta + c \le 0 \qquad (5.14)$$

and $x_\alpha$ is anything for which

$$(x_\alpha^3 + a x_\alpha + b)u_\alpha = p , \qquad u_\alpha \ge 0 \qquad (5.15)$$

can be satisfied. Evidently the cubic coefficient ought not to vanish in (5.15), and (4.9) is a formal way of avoiding this.

The embedding of ab initio linear equations can also be illustrated, either via a form of (4.10), or by embedding in a linear programming problem (cf. Noble and Sewell, op. cit. § 10(ii)). Linear problems in which the operator has special structure have been discussed by Barnsley and Robinson (1974, 1975/6).

6. Applications

(i) Introduction


We have carried out some preliminary calculations applying the general optimization method of § 2. These include an analysis of an electrical network with resistors having nonlinear voltage-current relationships, and a verification that a basis used by Martin (1964, inequality (21)) for displacement bounds in elastic bodies under certain dynamic conditions is a consequence of ideas like those of (2.13) or (2.14) above. Barnsley and Robinson (1976) illustrate the result (4.12) by applications to a nonlinear integral equation in communication theory, and a nonlinear differential equation in a thermal problem. Fujita (op. cit.) mentions examples for the linear problem (1.1).

We have concluded, however, that a fully representative illustration of the optimization method merits a separate investigation which we ought not to attempt here. A comparative study of the relative merits and power of the three methods described in §§ 2, 4 and 5 must also await the study of a number of examples.

The main objective of the present paper has been to establish some perspective by trying to uncover the structure of the requisite general theories. One may anticipate that in some later instances more rigorous statements may be required, but we have not conceived that to be necessary for our purpose here.

(ii) Nonlinear cantilever beam


In particular it is by no means clear from the literature that one-dimensional problems are genuinely representative of a theory which is to estimate pointwise bounds. Nevertheless it is a natural engineering starting point, and we conclude the paper by giving the reader a handle to the machinery of § 2 in such a case. This example was examined by Martin (1966) by an ad hoc engineering analysis, and a description from scratch of some of its connections with the present theoretical framework was given by Noble (1974) at an earlier stage of this research.

(iii)  Hamiltonian  representation  of  the  beam  problem 

We  first  show  how  the  elementary  governing  equations  of  the  problem 
can  be  cast  into  the  Hamiltonian  form  (1.16).  This  will  illustrate  how 
the  appropriate  spaces  and  operators  can  be  constructed  ab  initio 
in  a one-dimensional  problem.  Corresponding  material  in  three-dimensional 
boundary  value  problems  of  elasticity  and  plasticity  was  given  by  Sewell 
(1973a, b). 

We consider a thin straight cantilever beam made of nonlinear material. After conversion to nondimensional variables, let $s$ denote distance measured along the beam from the built-in end $s = 0$ to the free end $s = 1$. Suppose the beam is loaded transversely in a plane by a load $w(s)$ per unit length. (See Fig. 6.1.) The transverse small deflection (or deflection-'rate') in the direction of $w(s)$ is denoted by $u(s)$, and $M(s)$ is the internal bending moment.

Fig. 6.1. Continuously loaded thin cantilever

With an appropriate sign convention, elimination of the transverse internal shear force by differentiation leads to the single equilibrium equation


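The boundary conditions will be

$$ u(0) = \left.\frac{du}{ds}\right|_0 = 0 , \qquad M(1) = \left.\frac{dM}{ds}\right|_1 = 0 . \qquad (6.2) $$

The material is supposed to respond according to the 'creep law'

$$ \frac{d^2 u}{ds^2} = M^n \qquad (6.3) $$

for some given n.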

Our purpose in this sub-section is to express (6.1) - (6.3) in the
formalism of (1.16). The space E is chosen to consist of matrices like

$$ M \equiv \begin{bmatrix} M(s) \\ M(0) \\ 0 \end{bmatrix} \qquad (6.4) $$

constructed from real integrable functions M(s), the three entries
being associated respectively with the interior and with the two
end-points s = 0 and s = 1 of the beam. (The identity symbol emphasizes
a definition.) The inner product for E is defined as

$$ (M, N) = \int_0^1 M(s)\,N(s)\,ds + M(0)\,N(0) \qquad (6.5) $$

for  any  two  members  M and  N of  E.  The  space  F is  chosen  to 
consist  of  matrices  like 


$$ u \equiv \begin{bmatrix} u(s) \\ 0 \\ u(1) \end{bmatrix} \qquad (6.6) $$

constructed from real integrable functions u(s), the three entries again
being associated with the interior and the end-points of 0 < s < 1,
with the same ordering as in (6.4). The inner product for F is defined as

$$ \langle u, v \rangle = \int_0^1 u(s)\,v(s)\,ds + u(1)\,v(1) , \qquad (6.7) $$

for  any  two  members  u and  v of  F.  Notice  that  there  is  a slight 
clash  between  the  notation  M,  u just  introduced  by  the  definitions  (6.4) 
and  (6.6)  for  elements  of  the  spaces,  and  the  conventional  way  in  which 
the real scalar functions M(s), u(s) have been abbreviated in (6.1) - (6.3)
by  omitting  explicit  mention  of  the  argument  s.  This  need  not  cause 
confusion. 


It would have been possible to redefine E and F, by replacing
the zero value entries in (6.4) and (6.6) by the values M(1) and u(0)
(respectively) of the considered integrable functions, regarding these
values as unassigned at this stage (they would later be given zero values
in the subspaces E' and F'). Then M(1)N(1) could have been added to
the  definition  of  (M,N),  and  u(0)v(0)  to  that  of  <u,v>.  The  two  inner 
product  spaces  would  then  in  fact  be  the  same  space.  But  there  is  no 
advantage  in  that,  for  we  shall  next  be  obliged  to  consider  subspaces  E' 
and  F'  which  are  not  the  same.  In  any  event,  from  the  viewpoint  of 
general  theory  it  is  more  fruitful  to  regard  the  presence  of  two  (occasionally 
more)  distinct  spaces  as  the  rule,  and  their  coincidence  as  an  exception. 

The subspace E' is now defined to consist of those elements (6.4)
of E which are constructed from functions (typically M(s)) which are
not merely integrable, but also

    are single-valued and continuous, with continuous first derivatives,
    in 0 ≤ s ≤ 1;
    have piecewise continuous second derivatives in 0 < s < 1;
    have zero values at s = 1, e.g. M(1) = 0.

The subspace F' is defined to consist of those elements (6.6) of F which
are constructed from functions (typically u(s)) which again are not merely
integrable, but also

    are single-valued and continuous, with continuous first derivatives,
    in 0 ≤ s ≤ 1;
    have piecewise continuous second derivatives in 0 < s < 1;
    have zero values at s = 0, e.g. u(0) = 0.

The last property in each of these definitions shows that E' ≠ F', even
though we could have chosen E = F as described above.

We can now define operators T* and T mapping according to
(1.3) by the matrices

$$ T^{*} u \equiv \begin{bmatrix} d^2u/ds^2 \\ \left. du/ds \right|_0 \\ 0 \end{bmatrix} , \qquad (6.8) $$

$$ T M \equiv \begin{bmatrix} d^2M/ds^2 \\ 0 \\ -\left. dM/ds \right|_1 \end{bmatrix} . \qquad (6.9) $$

It can be verified that the statement (1.4) of adjointness, namely

$$ (M, T^{*}u) = \langle u, TM \rangle \qquad (6.10) $$

for all u in F' and for all M in E', here represents a double
integration by parts written as

$$ \int_0^1 M \frac{d^2u}{ds^2}\,ds + M(0) \left.\frac{du}{ds}\right|_0 = \int_0^1 u \frac{d^2M}{ds^2}\,ds - u(1) \left.\frac{dM}{ds}\right|_1 . \qquad (6.11) $$

The jumps allowed in the second derivatives do not affect the validity
of this, and the properties M(1) = 0 = u(0) of the subspaces have been used.
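The double integration by parts behind (6.11) can be checked on a concrete pair of trial functions. The following is a minimal sketch in Python with exact rational arithmetic on polynomials. The trial functions M(s) = (1 - s)(2 + s) and u(s) = 3s + s^3, and all helper names, are illustrative assumptions chosen only so that M(1) = 0 and u(0) = 0; they are not taken from the paper.

```python
from fractions import Fraction

# Polynomials in s are coefficient lists, lowest power first (hypothetical helpers).
def deriv(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def value(p, s):
    """Value of a polynomial at the point s."""
    return sum(Fraction(c) * Fraction(s) ** k for k, c in enumerate(p))

def integral01(p):
    """Integral of a polynomial over 0 <= s <= 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

# Assumed trial functions: M = 2 - s - s^2 (so M(1) = 0), u = 3s + s^3 (so u(0) = 0).
M = [Fraction(2), Fraction(-1), Fraction(-1)]
u = [Fraction(0), Fraction(3), Fraction(0), Fraction(1)]
assert value(M, 1) == 0 and value(u, 0) == 0

# Left side: integral of M u'' plus the boundary term M(0) u'(0).
lhs = integral01(mul(M, deriv(deriv(u)))) + value(M, 0) * value(deriv(u), 0)
# Right side: integral of u M'' minus the boundary term u(1) M'(1).
rhs = integral01(mul(u, deriv(deriv(M)))) - value(u, 1) * value(deriv(M), 1)

print(lhs, rhs)
```

Both sides agree for this pair, and the boundary term at s = 1 enters with a minus sign on the right-hand side.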
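Finally we can introduce the Hamiltonian functional

$$ X[M,u] = \int_0^1 \left[ \frac{1}{n+1} M^{n+1} + u w \right] ds . \qquad (6.12) $$

This has no boundary terms, which is exceptional, and so its gradients are

$$ \frac{\partial X}{\partial M} = \begin{bmatrix} M^n \\ 0 \\ 0 \end{bmatrix} , \qquad (6.13) $$

$$ \frac{\partial X}{\partial u} = \begin{bmatrix} w \\ 0 \\ 0 \end{bmatrix} . \qquad (6.14) $$

The equations (1.16) now appear as

$$ T^{*}u = \frac{\partial X}{\partial M} \quad (\alpha) , \qquad TM = \frac{\partial X}{\partial u} \quad (\beta) , \qquad (6.15) $$
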
which  can  be  seen  as  an  alternative  statement  of  the  original  equations 
(6.1)  - (6.3),  bearing  in  mind  also  the  properties  of  E'  and  F'. 

The problem is thus generated from equations (6.15) by the Hamiltonian
functional X[M,u] of (6.12), which is strictly convex in M if n is
an odd integer (or convex in the half-space M > 0 if n is any integer),
and linear in u. A classical elastic beam has n = 1.

Thus the problem is very similar to the Fujita problem (1.14) when
n = 1, or its generalization (2.26) when n > 1, except that the role
of the variables is reversed. In applying the general theory, we therefore
expect to have a case of intermediate generality like that of subsections
2(vi) - (viii).

(iv)  Lagrangian generating functional

Evidently (6.15α) can be regarded as the 'constitutive equation', and
(6.15β) as the 'equilibrium equation'. The quote marks remind us that
these equations in fact contain some of the boundary conditions embedded
in them as well. The equations can also be regarded as generated from
(1.13) via the Lagrangian functional

$$ \begin{aligned}
L[M,u] &= (M, T^{*}u) - \int_0^1 \left[ \frac{1}{n+1} M^{n+1} + u w \right] ds \\
&= \int_0^1 M \frac{d^2u}{ds^2}\,ds + M(0) \left.\frac{du}{ds}\right|_0 - X[M,u] \\
&= \langle u, TM \rangle - X[M,u] \\
&= \int_0^1 u \frac{d^2M}{ds^2}\,ds - u(1) \left.\frac{dM}{ds}\right|_1 - X[M,u] .
\end{aligned} \qquad (6.16) $$


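In other words, the constitutive equations may be derived as

$$ \frac{\partial L}{\partial M} = T^{*}u - \frac{\partial X}{\partial M} = \begin{bmatrix} d^2u/ds^2 - M^n \\ \left. du/ds \right|_0 \\ 0 \end{bmatrix} = 0 , \qquad (6.17\alpha) $$

and the equilibrium equations as

$$ \frac{\partial L}{\partial u} = TM - \frac{\partial X}{\partial u} = \begin{bmatrix} d^2M/ds^2 - w \\ 0 \\ -\left. dM/ds \right|_1 \end{bmatrix} = 0 . \qquad (6.17\beta) $$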


The underdetermined class of solutions $M_\alpha$, $u_\alpha$ of (6.17α), and
$M_\beta$, $u_\beta$ of (6.17β), are generated from (6.4) and (6.6) as follows. The
element $u_\alpha$ must be constructed from a function $u_\alpha(s)$ which satisfies
$u_\alpha(0) = 0$ because it must belong to the domain F' of T*. By a
double integration of (6.17α) with any integrable function $M_\alpha(s)$, using
dummy variables σ and t, we have

$$ u_\alpha(s) = \int_0^s \int_0^t [M_\alpha(\sigma)]^n \, d\sigma \, dt . \qquad (6.18) $$

The element $M_\beta$ must be constructed from a function $M_\beta(s)$ which
satisfies $M_\beta(1) = 0$, because it must belong to the domain E' of T.
By a double integration of (6.17β) with any integrable loading function
w(s), we have

$$ M_\beta(s) = \int_1^s \int_1^t w(\sigma) \, d\sigma \, dt = \tfrac{1}{2} w (s - 1)^2 \quad \text{if } w(\sigma) = \text{constant} . \qquad (6.19) $$

Nothing need be said about a function $u_\beta(s)$ or associated element
$u_\beta$, because this is absent from (6.17β).
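The two double integrations just described are easy to carry out for polynomial data. The sketch below is a hedged illustration in Python, not the authors' procedure: it assumes a polynomial load w(s), integrates the load twice from s = 1 to obtain the equilibrium moment, and integrates the n-th power of a trial moment twice from s = 0 to obtain the associated deflection. The function names `integrate_from`, `m_beta` and `u_alpha` are hypothetical.

```python
from fractions import Fraction

# Polynomials in s are coefficient lists, lowest power first (hypothetical helpers).
def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def power(p, n):
    out = [Fraction(1)]
    for _ in range(n):
        out = mul(out, p)
    return out

def integrate_from(p, a):
    """Antiderivative of p that vanishes at s = a."""
    anti = [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(p)]
    anti[0] = -sum(c * Fraction(a) ** k for k, c in enumerate(anti))
    return anti

def m_beta(w):
    """Moment with M'' = w and M(1) = M'(1) = 0: integrate twice from s = 1."""
    return integrate_from(integrate_from(w, 1), 1)

def u_alpha(m_alpha, n):
    """Deflection with u'' = M^n and u(0) = u'(0) = 0: integrate twice from s = 0."""
    return integrate_from(integrate_from(power(m_alpha, n), 0), 0)

# Assumed data: uniform unit load w = 1 and the linear law n = 1.
w = [Fraction(1)]
Mb = m_beta(w)        # coefficients of (s - 1)^2 / 2
ua = u_alpha(Mb, 1)   # coefficients of s^2 (s^2 - 4 s + 6) / 24
print(Mb)
print(ua)
```

With the trial moment taken equal to the equilibrium moment and n = 1, the resulting deflection coincides with the closed-form solution for a uniformly loaded elastic cantilever.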




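The total potential energy associated with any such α-solution is

$$ \begin{aligned}
L_\alpha &= \left( M_\alpha , \left.\frac{\partial X}{\partial M}\right|_\alpha \right) - X[M_\alpha, u_\alpha] \\
&= \int_0^1 \left[ \frac{n}{n+1} M_\alpha^{n+1} - u_\alpha w \right] ds \\
&= \int_0^1 \left[ \frac{n}{n+1} M_\alpha^{n+1} - M_\beta M_\alpha^{n} \right] ds .
\end{aligned} \qquad (6.20) $$

The total complementary energy associated with any such β-solution is

$$ -L_\beta = -\left\langle u_\beta , \left.\frac{\partial X}{\partial u}\right|_\beta \right\rangle + X[M_\beta, u_\beta] = \int_0^1 \frac{1}{n+1} M_\beta^{n+1}\,ds . \qquad (6.21) $$

Any actual solution value of L is

$$ L_0 = \int_0^1 \left[ \frac{n}{n+1} M_0^{n+1} - u_0 w \right] ds = - \int_0^1 \frac{1}{n+1} M_0^{n+1}\,ds = - \int_0^1 \frac{w}{n+1}\, u_0\,ds . \qquad (6.22) $$

In other words, $L_0$ is itself a linear functional of $u_0$.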

When n is odd, L[M,u] is a saddle functional concave in M
and linear in u, and the standard energy methods are applications of
the extremum principles (2.5) with these specific expressions (6.18) - (6.21).
The difference between the energy bounds for odd n is

$$ \begin{aligned}
L_\alpha - L_\beta &= \frac{1}{n+1} \int_0^1 \left[ n M_\alpha^{n} (M_\alpha - M_\beta) - M_\beta (M_\alpha^{n} - M_\beta^{n}) \right] ds \ge 0 \\
&= \frac{1}{2} \int_0^1 (M_\alpha - M_\beta)^2\,ds \quad \text{when } n = 1 .
\end{aligned} \qquad (6.22) $$
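The sign of the gap between the two energy bounds can be illustrated numerically. The following Python sketch is an illustration under stated assumptions, not part of the original analysis. It takes the potential energy as the integral of n/(n+1) M_α^(n+1) - M_β M_α^n and the complementary energy as the integral of M_β^(n+1)/(n+1), as read from the expressions above, and evaluates their sum exactly for an assumed uniform unit load and an assumed linear trial moment M_α(s) = (1 - s)/2. All function names are hypothetical.

```python
from fractions import Fraction

# Polynomials in s are coefficient lists, lowest power first (hypothetical helpers).
def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def power(p, n):
    out = [Fraction(1)]
    for _ in range(n):
        out = mul(out, p)
    return out

def integral01(p):
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def energy_gap(m_alpha, m_beta, n):
    """L_alpha - L_beta for polynomial moments and an odd integer n."""
    c = Fraction(1, n + 1)
    upper = (n * c * integral01(power(m_alpha, n + 1))
             - integral01(mul(m_beta, power(m_alpha, n))))   # potential energy L_alpha
    lower = -c * integral01(power(m_beta, n + 1))            # L_beta = -(complementary energy)
    return upper - lower

# Assumed data: M_beta = (1 - s)^2 / 2 (uniform unit load), trial M_alpha = (1 - s) / 2.
m_beta = [Fraction(1, 2), Fraction(-1), Fraction(1, 2)]
m_alpha = [Fraction(1, 2), Fraction(-1, 2)]

for n in (1, 3):
    print(n, energy_gap(m_alpha, m_beta, n))

# For n = 1 the gap should equal half the integral of (M_alpha - M_beta)^2.
diff = [a - b for a, b in zip(m_alpha + [Fraction(0)], m_beta)]
half_square = integral01(mul(diff, diff)) / 2
print(half_square)
```

The gap is positive for both values of n and vanishes when the trial moment is replaced by the equilibrium moment itself.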


(v)  A strong bound

Recalling the remarks of §2(x) and noticing that (6.22) reduces $L_0$
itself to a linear functional of $u_0$, we enquire if a strong bound can
be constructed for the beam problem.

Substitution of the Lagrangian (6.16) into (2.12) leads to

$$ \frac{1}{n+1} \int_0^1 M_-^{n+1}\,ds \ge \int_0^1 u_0 \left[ \frac{d^2M_-}{ds^2} - \frac{n w}{n+1} \right] ds - u_0(1) \left.\frac{dM_-}{ds}\right|_1 \qquad (6.23) $$

after using the matrix expression (6.17β) for the nonzero gradient $\partial L/\partial u|_-$
in the inner products defined via (6.7).

Since the minus point in (2.12) is arbitrary, so is the bending moment
distribution $M_-(s)$ except that it must be in E', the domain of T
specified above for this problem.

This makes a precise connection with Martin's (1966) result, if we
now choose $M_-(s)$ to be in equilibrium with the fictitious loading
distribution shown in Fig. 6.2. That is to say, $M_-(s)$ is to satisfy

$$ \frac{d^2M_-}{ds^2} = \frac{n\,w(s)}{n+1} \quad \text{in } 0 < s < 1 , \qquad (6.24) $$

$$ -\frac{dM_-}{ds} = P \quad \text{at } s = 1 , \qquad (6.25) $$

in addition to $M_-(1) = 0$ already required by the subspace E', where P
is a given number. Such a choice is made because it allows the pointwise
estimate

$$ u_0(1) \le \frac{1}{(n+1)P} \int_0^1 M_-^{n+1}\,ds \qquad (6.26) $$

to be obtained from (6.23) for the deflection $u_0(1)$ at the end of the
beam under the actual loading of Fig. 6.1.

Fig. 6.2. Fictitious loading on cantilever

When the distributed load w is uniform, the required solution
of (6.24) is

$$ M_-(s) = \frac{n}{n+1}\,\frac{1}{2}\,w (1 - s)^2 + P (1 - s) , \qquad (6.27) $$

whence (6.26) becomes Martin's (1966) pointwise estimate (12).
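To see how tight the pointwise estimate can be, the bound may be evaluated for a uniform load and compared with the tip deflection obtained by integrating the creep law directly. The Python sketch below is only an illustration under stated assumptions: it reads the estimate as u_0(1) ≤ (1/((n+1)P)) times the integral of M_-^(n+1), with M_- as in (6.27), it assumes P > 0 and odd n, and it works in the variable t = 1 - s, which leaves integrals over the unit interval unchanged. The function names are hypothetical.

```python
from fractions import Fraction

# Polynomials in t = 1 - s are coefficient lists, lowest power first (hypothetical helpers).
def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def power(p, n):
    out = [Fraction(1)]
    for _ in range(n):
        out = mul(out, p)
    return out

def integral01(p):
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def strong_bound(n, w, P):
    """Upper estimate of u_0(1) for uniform load w and fictitious end load P > 0."""
    # M_-(t) = n/(n+1) * (w/2) * t^2 + P * t, with t = 1 - s.
    m_minus = [Fraction(0), Fraction(P), Fraction(n, n + 1) * Fraction(w) / 2]
    return integral01(power(m_minus, n + 1)) / ((n + 1) * Fraction(P))

def exact_tip_deflection(n, w):
    """u_0(1) = integral over 0..1 of t * M_0(t)^n dt, with M_0(t) = w t^2 / 2."""
    m0 = [Fraction(0), Fraction(0), Fraction(w) / 2]
    return integral01(mul([Fraction(0), Fraction(1)], power(m0, n)))

for n in (1, 3):
    exact = exact_tip_deflection(n, 1)
    for P in (Fraction(1, 10), Fraction(1, 5), Fraction(1)):
        print(n, P, float(strong_bound(n, 1, P)), float(exact))
```

For n = 1 and w = 1 the exact tip deflection is 1/8, and choosing P near 0.2 brings the estimate within about two percent of it, so the choice of the assignable number P matters.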

(vi)  Weak  bounds 

It is possible, for example, to optimize (2.14) by identifying the
above choice of $M_-$ with

$$ M_- = M_\beta + h p \qquad (6.28) $$

where h is a scalar and p a member of E'. But we already know that,
in a sense, more than necessary has been given away in this particular
problem, and the result does not seem to be helpful. For example, in
the case n = 1 we arrive at

$$ \left[ \int_0^1 (M_0 - M_\beta)\,p\,ds \right]^2 \le \int_0^1 (M_\alpha - M_\beta)^2\,ds \int_0^1 p^2\,ds . \qquad (6.29) $$

A reason why we said that this problem has only limited representative
value can now be seen. It is because, since we are at liberty to choose
any $M_\alpha(s)$ for insertion into (6.18), and since we know $M_\beta(s)$ from
(6.19), we can choose

$$ M_\alpha(s) = M_\beta(s) . \qquad (6.30) $$

This is the perfect choice bearing no margin of error in (6.29), and in fact
corresponds to the exact solution $M_0(s)$. The exact solution when n = 1
and w = constant is

$$ M_0(s) = \tfrac{1}{2} w (s - 1)^2 , \qquad u_0(s) = \tfrac{1}{24} w s^2 (s^2 - 4s + 6) . \qquad (6.31) $$
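The closed-form solution for n = 1 and constant w can be checked against the governing equations and used to evaluate the solution value of L. The Python sketch below is a hedged illustration with w = 1 assumed. It verifies M'' = w, u'' = M and the four boundary conditions, and then evaluates L_0 in the three equivalent forms discussed earlier: the integral of M_0^2/2 - u_0 w, minus half the integral of M_0^2, and minus half the integral of u_0 w. The helper names are hypothetical.

```python
from fractions import Fraction

# Polynomials in s are coefficient lists, lowest power first (hypothetical helpers).
def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def value(p, s):
    return sum(Fraction(c) * Fraction(s) ** k for k, c in enumerate(p))

def integral01(p):
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

w = Fraction(1)                                           # assumed uniform unit load
M0 = [w / 2, -w, w / 2]                                   # w (s - 1)^2 / 2
u0 = [Fraction(0), Fraction(0), w / 4, -w / 6, w / 24]    # w s^2 (s^2 - 4 s + 6) / 24

equilibrium_ok = deriv(deriv(M0)) == [w]                  # M'' = w
creep_law_ok = deriv(deriv(u0)) == M0                     # u'' = M when n = 1
boundary_ok = (value(u0, 0) == 0 and value(deriv(u0), 0) == 0
               and value(M0, 1) == 0 and value(deriv(M0), 1) == 0)

# Three forms of the solution value L_0 for n = 1.
L0_a = integral01(mul(M0, M0)) / 2 - w * integral01(u0)
L0_b = -integral01(mul(M0, M0)) / 2
L0_c = -w * integral01(u0) / 2

print(equilibrium_ok, creep_law_ok, boundary_ok)
print(L0_a, L0_b, L0_c, value(u0, 1))
```

All three forms of L_0 coincide, and the tip deflection u_0(1) equals w/8.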